Running Totals and Force Unwraps

Posted on November 27, 2014November 27, 2014 by airspeedvelocity

Recently, while implementing my game of immutable snake, I needed to write a function that produced an array of running totals.¹ Since creating an array with reduce is all the rage these days, my first take was like this:

let start = 0
let array = [1,2,3]
let runningTotal = reduce(array, [start]) { $0 + [$0.last! + $1] }
// produces [0,1,3,6]

As you can see, that code snippet contains a force unwrap. It’s perfectly safe – you can see by inspection there’s no possible way that the array could ever be empty, so .last will never be nil. Well, except for that time in the future where I decide I don’t want the initial value at the start of the array after all, and so change reduce’s initial value to [], and kerplowie.

Obviously that’d be easy to catch in a test, and you’d probably spot the ! when making the change. But maybe there are a few lines between the change and the unwrap. Maybe it’s dependent on a variable rather than a literal. In Swift, a ! should be like a traffic Stop sign. Stop for a minute and look both ways very carefully or you might get run over by a truck. Do you really need it? Is it definitely safe, no shadow of a doubt?

The best outcome from this pause for thought is another, equally valid, equally readable, way to solve the problem that removes the need for the !.

For example, take the following snippet from the Swift section of John Siracusa’s Yosemite review:

let ages = [
    "Tim":  53,  "Angela": 54,  "Craig":   44,
    "Jony": 47,  "Chris":  37,  "Michael": 34,
]
let people = sorted(ages.keys, <).filter { ages[$0]! < 50 }
println(people)

There’s another way to write this that obviates the need for the ! completely:²

// dictionary's implementation of SequenceType (which is what filter takes)
// results in iterating over all the (key,value) pairs
let people = filter(ages) { $0.1 < 50 }.map { $0.0 }.sorted(<)

// or, if you find $0.1 ugly/unreadable:
let people = filter(ages) { (k,v) in v < 50 }.map { (k,v) in k }.sorted(<)

Hopefully just as concise, but no need for the force unwrap.

I couldn’t spot a similar rewrite for my runningTotal calculation, so I took the lazy option and threw it out as a question on twitter.³

First up, suggestions of using ?? or if let instead of !:

let runningTotal = reduce(array, [start]) { $0 + [($0.last ?? 0) + $1] }

let runningTotal = reduce(array, [start]) { 
    if let val = $0.last {
        $0 + [$0.last! ?? 0 + $1]
    }
    else {
        // what to put here?
    }
}

These seem safer – but I feel like that safety can be an illusion. You know that thenil can never, ever, happen in the code as it stands. So what’s the point? Well, to kick in if you ever make a change by mistake that means you do force-unwrap a nil. But if that ever happened, you would have a bug, and sweeping it under the carpet and silently continuing could be just as bad. So perhaps what should go in that else clause is an assert, just like you’d get with the force-unwrap. In which case you’re not much better off.

How about eliminating the call to last, by passing the last value along as you go?

let runningTotal = reduce(a, (start,[start])) { ($0.0 + $1, $0.1 + [$0.0 + $1]) }.1

This version has no force-unwrap, but it’s pretty nasty to read as a one-liner. And the addition is being done twice so that could cause a performance problem for something more complex than an Int.

Speaking of performance, how about a version that uses map instead of reduce?

var runningValue = start
let runningTotal = [start] + map(a) {
    runningValue += $0
    return runningValue
}

“Ick, var”, you say. But you might change your mind if you ran a benchmark. This version screams ahead of the reduce version in terms of performance for arrays larger than a handful of elements. This is not surprising – Swift arrays aren’t lists. Concatenating two arrays is, in the absence of some kind of fancy optimization, O(n), and you’re doing that n times. So the innocent-looking reduce(array,[],+) version of flatmap is quadratic, so maybe best left as an educational example rather than something you actually do.

Anyway, if you must use a mutable variable, you can make yourself feel better by putting it in a generic function you know is safe and can use again and again. A function to produce a list of running totals is popular enough to find a place in plenty of standard libraries.

In Haskell, it’s called scanl. The Typelift Basis library has a very attractive recursive definition (which uses some other functions defined in Typelift, but you can probably get the idea):

public func scanl<B, A>(f : B -> A -> B) -> B -> [A] -> [B] {
    return { q in { ls in
        switch destruct(ls) {
            case .Empty:
                return q <| []
            case .Destructure(let x, let xs):
                return q <| scanl(f)(f(q)(x))(xs)
        }
    } }
}

As pretty as this is, relying on this kind of recursion in Swift as of now is probably also going to lead to performance problems. So here’s a more imperative implementation. In keeping with similar naming for reduce, I’ll call it accumulate, which is the name it has in Python:

// A version that returns any type that adheres so ExtensibleCollectionType
func accumulate
  <S: SequenceType, C: ExtensibleCollectionType>
  (source: S, var initial: C.Generator.Element, combine: (C.Generator.Element, S.Generator.Element) -> C.Generator.Element)
  -> C {
    var result = C()
    result.append(initial)
    for x in source {
        initial = combine(initial, x)
        result.append(initial)
    }
    return result
}

// And a default version that returns an Array, as with map
func accumulate<S: SequenceType, U>
  (source: S, initial: U, combine: (U, S.Generator.Element)->U)
  -> [U] {
    return accumulate(source, initial, combine)
}

So there you go. I’ve started putting the re-useable functions found on this blog up in a library on github, SwooshKit, that you might want to take a look at.

The actual code was to compute the coordinates of the snake’s body from the head through the tail. ↩
This also has the benefit of forcing you to put the sort at the end, where it’s sorting a filtered list which ought to be quicker. Unfortunately it introduces a different performance problem, can you spot it? ↩
Thanks to @nnnnnnnn, @jl_hfl, @CodaFi_, @rob_rix, @inthehands and @maxbucknell for all their helpful replies. ↩

Which function does Swift call? Part 6: Tinkering with priorities

Posted on November 16, 2014April 5, 2015 by airspeedvelocity

This is part of a series of posts on how Swift resolves ambiguity in overloaded functions. You can find Part 1 here. In the previous article, we looked at why you get a Range from the ... by default. The answer? Because Range has some extra validation that relies on the ... input being comparable, and this makes the Range version more specific.

Defaulting to Neither

So, suppose you didn’t want Range to be the default, but still wanted to benefit from this extra validation?

If your goal was to keep the ambiguity, and force the user to be explicit about whether they wanted a Range or a ClosedInterval, you could declare the following:

func ...
  <T: Comparable where T: ForwardIndexType>
  (start: T, end: T) -> ClosedInterval<T> {
    return ClosedInterval(start, end)
}

Now, unless you specify explicitly what type you want, you’ll get an ambiguous usage error when you use the ... operator.

(It might make you a bit nervous to see ForwardIndexType in the declaration of a function for creating intervals, given intervals have nothing to do wth indices. But don’t worry, it’s really only there to carve out an identical domain to the equivalent Range function to force the ambiguity for the same set of possible inputs.)

Defaulting to ClosedInterval

What if you actually wanted ClosedInterval to be the default? This is a little trickier.

If you just wanted to cover the most common case, integers, then you could do it like so:

func ...<T: IntegerType>(start: T, end: T) -> ClosedInterval<T> {
    return ClosedInterval(start, end)
}

// x will now be a ClosedInterval
let x = 1...5

This works because IntegerType conforms to RandomAccessIndex (which eventually comforms to ForwardIndexType).

Now, this would be a very specific carve-out for integers. You could leave it there, because integers are a special case, being as how they’re so fundamental and it’s a bit odd that they’re defined as an index type too.

But If you wanted an interval for all other type that are both comparable and an index, you’d need to handle each one on a case-by-case basis. For example, string indices are comparable:

// note, this doesn’t even have to be generic, since Strings aren’t generic
func ...(start: String.Index, end: String.Index) -> ClosedInterval<String.Index> {
    return ClosedInterval(start, end)
}

let s = "hello"

// y will now be a ClosedInterval
let y = s.startIndex...s.endIndex

Two problems become apparent with this.

First, this gets real old real quickly. Every time you create a new index type you have implement a whole new ... function for it.

Second, even then it’s still out of your control. You want other people to be able to create new index and comparable types and use them with ranges and intervals, and they aren’t necessarily going to do this. So instead of a nice predictable “you get a range by default”, we now have “you’ll probably get an interval, so long as the type has the appropriate ... overload”. That doesnt’t sound good at all.

Your best bet at this point would be to force anyone who wants to their type to be useable with an interval to tag it with a particular protocol. This is what stride does – types have to conform to the Strideable protocol. So let’s try defining an equivalent for ClosedInterval. We’ll call it, uhm, Intervalable.

protocol Intervalable: Equatable { }

extension Int: Intervalable { }

extension String.Index: Intervalable { }

// The following definintions of ... would need
// to REPLACE the current definitions.  So you
// can only do this if you're the author of
// ClosedInterval

func ...<T: Intervalable>
  (start: T, end: T) -> ClosedInterval<T> {
    return ClosedInterval(start, end)
}

func ...<T: Intervalable where T: ForwardIndexType>
  (start: T, end: T) -> ClosedInterval<T> {
    return ClosedInterval(start, end)
}

// x will be a ClosedInterval
let x = 1...5

let s = "hello"

// y will be a ClosedInterval
let y = s.startIndex...s.endIndex

Of course, this is only something the author of the original type can really implement, not something you can do yourself if you personally prefer the default to be the other way around.

And it’s a bit heavy-handed to force every type to conform to a protocol when really all it needs is to be comparable for the interval to work. But I can’t think of another option (even for the implementor) right now.¹ If anyone else can think of a good solution, let me know.

except possible tinkering with the Comparable hierarchy, which is even more restricting since only Apple could do that. ↩

Which function does Swift call? Part 5: Range vs Interval

Posted on November 10, 2014November 10, 2014 by airspeedvelocity

This is part of a series of posts on how Swift resolves ambiguity in overloaded functions. You can find Part 1 here. In the previous article, we looked at how generics are handled.

So finally…

Now that we’ve covered how generics fit in, we can finally answer the original question – how come you get a Range back from 1...5 by default? To recap the sample code:

// r will be of type Range<Int>
let r = 1...5

// if you want a ClosedInterval<Int>, you
// have to be explicit:
let integer_interval: ClosedInterval = 1...5

// if you use doubles though, you don’t need
// to declare the type explicitly.
// floatingpoint_interval is a ClosedInterval<Double>
let floatingpoint_interval = 1.0...5.0

The infix ... function is defined three times in the standard library, with the following signatures:

func ...<T : Comparable>
  (start: T, end: T) -> ClosedInterval<T>

func ...<Pos : ForwardIndexType>
  (minimum: Pos, maximum: Pos) -> Range<Pos>

func ...<Pos : ForwardIndexType where Pos : Comparable>
  (start: Pos, end: Pos) -> Range<Pos>

The first two are pretty similar. They are both generic, and have one generic placeholder that is constrained by a single protocol. There’s nothing favouring one of the two constraints – Comparable and ForwardIndexType both descend from Equatable, but neither is a descendent of the other. Left as just those two functions, you’d get an ambiguous call error.

It’s the third function above that is what makes Swift pick Range over ClosedInterval. In it, Pos is constrained not just by ForwardIndexType but also by Comparable. Constraining by two protocols wins over one, so this is the overload that’s picked.

Well, after 4 long lead-up posts, that was a bit of an anticlimax, eh?

So let’s ask another question – why is that last function there?

One possible explanation: the version with two protocol constrains was implemented purely to break the tie, since ambiguous call errors are annoying. Creating a Range doesn’t actually require comparable, but adding the comparable requirement has the effect of favouring ranges over intervals. If this was the desired outcome, that could be all the extra ... is there for.

But that’s probably not why. There’s another reason why there’s an overload that requires Comparable.

Factory Functions

A recurring pattern in the Swift standard library is to use free functions as factories to construct different types depending on the function name or arguments.

For example, the lazy function is defined 4 times for 4 different argument types: once each for random-access, bidirectional, and forward collections; and once for sequences.¹ Each one returns a different matching lazy type (e.g. LazyRandomAccessCollection). When you call lazy, the compiler picks the most specific match (in the order just given). When you combine this with type inference, you get the most powerful type available declared for you, all determined at compile time.

// a will be a LazyRandomAccessCollection
// since arrays are random access
let a = lazy([1,2,3,4])

// s will be a LazyBidirectionalCollection,
// since strings can't be indexed randomly
let s = lazy("hello")

// r will be a LazySequence, since StrideTo
// isn't a collection, just a sequence
let r = lazy(stride(from: 1, to: 8, by: 2))

// there's nothing stopping you declaring these
// lazy objects manually directly:
let l = LazyRandomAccessCollection([1,2,3,4])

Each version of lazy might not even do anything other than initialize and return the relevant type – it just makes declaring different lazy types more convenient than giving the relevant class name in full. All this is made possible by the overloading priorities described in this series. Most everyday users of lazy don’t need to know any of this though – they just use the function and it does what you’d intuitively guess was the “right” thing.

In some cases, which factory function to call is controlled more directly by the caller. For example, the stride function creates either a StrideTo or a StrideThrough depending on whether the named middle argument is to: or through:.

// x will be a StrideThrough
let x = stride(from: 1, through: 10, by: 5)

// y will be a StrideTo
let y = stride(from: 1, to: 10, by: 5)

In the case of stride, this is the only way you can construct these objects, as their initializers are private (which means stride can call them since it’s declared inside Swift, but you can’t).

Using Factory Functions to Perform Validation

For Range, the ... operator performs two extra tasks on top of constructing the return value. First, if the input supports comparators, it can validate that the begin and end arguments aren’t inverted. If they are, you get a runtime assertion. You only get this validation when using the ... operator. If you try and declare a range like this: Range(start: 5, end: 1), it’ll work.²

By the way, you’ll even get this check with String ranges. Swift string indices aren’t random-access (because of variable-length characters ), but they are comparable. While you can’t know how many characters are between two indices, you can know that one index comes before the another.

The other purpose of the ... function is to normalize closed ranges into half-open ranges. Range is kind of the opposite way around to ClosedInterval and HalfOpenInterval. There is no closed version of Range, ranges are always half-open. When you call ..., it increments the second parameter to make it equivalent to a half-open range. If you type 1...5 into a playground, you’ll see what is actually created is displayed as 1..<6.

This is why you'll get a runtime assertion if you ever try and construct a closed range through the end index of a string, even though in theory it could be a valid thing to do:

let s = "hello"
let start = s.startIndex
let end = s.endIndex

// fatal error: can not increment endIndex
let r = start...end

// same as if you did this:
end.successor()

Could you put this logic into Range.init? In the case of the half-open conversion, maybe you could, by having two versions of init with to: and through: arguments. But it's cleaner to do it using operators.

(You could maybe say the same about stride, but then you'd need custom ternary operators.³)

But in the case of the inversion check that uses comparable, you'd have to make a generic version of init, maybe something like this:

extension Range {
    init<C: Comparable>(start: C, end: C) {
        assert(start <= end, "Can't form Range with end < start")
        self.init(start: start, end: end)
    }
}

Well, this will compile but it won't do what you want. In fact your init will never be called, even if you pass in a comparable type. Why? Because it's generic, and the regular version of Range.init is not. As we've seen previously, generic functions never get called when there's a possible non-generic overload.

“Hey, no wait, the current Range.init is so generic!” you complain. “Look:”

// Excerpt from the Swift definition of Range
struct Range : Equatable ...etc {
    /// Construct a range with `startIndex == start` and `endIndex ==
    /// end`.
    init(start: T, end: T)
}

“See? T is a generic placeholder! And we’re puting more constraints on our function’s placeholder, so it should be the one that’s called.”

OK yes, T is a generic placeholder. But it’s not a placeholder for the function. It’s a placeholder for the struct. In the context of the function, T is already fixed in place as a specific type.⁴ To help understand this, try the following code:

struct S<T> {
    func f(i: Int) { print("f(Int)") }
    func f(t: T) { print("f(T)") }

    func g(i: Int) { print("g(Int)") }

    // note, here scoping rules mean the placeholder 
    // T is a _different_ T to the struct's T
    func g<T: IntegerType>(t: T) { print("g(T)") }
}

// fix T to be an Int
let s = S<Int>()

// error: Ambiguous use of f
s.f(1)

// prints "g(Int)" not "g(T)"
// because generics lose...
s.g(1)

Incidentally, it’s stuff like the potentially confusing scoping of T above (or that placeholders might get randomly renamed) that makes me nervous about re-using a struct’s placeholders when extending that struct. It doesn’t feel right. Usually, nicely-written generic classes typealias their placeholders in some fashion. For example, Range typealiases T as Index. Probably best to use that.

Anyway, for Range, declaring a generic function ..., that doesn’t have to compete with any non-generic alternatives, makes these problems go away. Plus operators look nice.

Next up: what if we didn’t want Range to be the default? What could we do about it?

For more on Swift’s lazy types, see this article ↩
well, depending on your definition of “work” I guess. ↩
If you want to see how these are possible (if perhaps inadvisable) Nate Cook recently wrote a good article about them. ↩
If you really want to tie your brain in knots, try to think about a secenario where the struct placeholder is fixed by which init is chosen, which relies on what T is… ↩

	Sorting Nibbles in S… on Sorting Nibbles in Swift
	Collection Data Stru… on Arrays, Linked Lists and …
	Swift化零为整：Reduce 详解… on Arrays, Linked Lists and …
	Writing A Generic St… on Generic Collections, SubSequen…
	Writing A Generic St… on Collection Indices, Slices, an…

	Sorting Nibbles in S… on Sorting Nibbles in Swift
	Collection Data Stru… on Arrays, Linked Lists and …
	Swift化零为整：Reduce 详解… on Arrays, Linked Lists and …
	Writing A Generic St… on Generic Collections, SubSequen…
	Writing A Generic St… on Collection Indices, Slices, an…

Airspeed Velocity

African or European Swift?

Month: November 2014

Running Totals and Force Unwraps

Which function does Swift call? Part 6: Tinkering with priorities

Which function does Swift call? Part 5: Range vs Interval