Working at being lazy

Posted on July 27, 2014July 29, 2014 by airspeedvelocity

As teased in the previous article, suppose you want a new kind of lazy evaluation. You want to be able to generate a lazy sequence that represents the first n values of a container.

Here’s how you could add it to LazyRandomAccess:

extension LazyRandomAccessCollection {
    func first_n
      (n: LazyRandomAccessCollection.IndexType.DistanceType)
      -> LazySequence<
           PermutationGenerator<
             LazyRandomAccessCollection,
             Range<LazyRandomAccessCollection.IndexType>
           >
         > {
        let start = self.startIndex
        let end = min(self.endIndex, start.advancedBy(n))
        let range = start..<end
        let perm = PermutationGenerator(elements: self, indices: range)
        return lazy(perm)
    }
}

This uses PermutationGenerator, which is a lazy view class that returns a (sub-)sequence of values from a collection in the order given by another collection of indices into that collection. In first_n’s case, the indices are a range of the first n indices (or the full collection if there are fewer than n elements).

The return type of first_n is a bit crackers so I’ve broken it up over multiple lines. It returns a lazy sequence of a permutation generator of a lazy random access collection, permuted by a range of lazy random access collection indices. Phew.¹ This is why the lazy function and Swift’s type inference is more than just nice to have, it’s essential. The actual types you are using in a simple function can quickly get to a point where there’s no practical way you could declare them by hand.

The practice of returning another lazy object that wraps the results is copied from the lazy map and filter members, which do the same. Why wrap PermutationGenerator in another LazySequence instead of returning it directly? Chaining mostly I think.² Without doing that, you’d have to re-lazy the result if you wanted to run more lazy filters on it:

let r = 1...10
// with lazy wrapper
let l = lazy(r).first_n(5).filter(isOdd)
// without lazy wrapper
let l = lazy(lazy(r).first_n(5)).filter(isOdd)

That works for a collection,³ but what about a sequence? How can we generate a new sequence that only returns the first n values?

Instead of declaring that in one shot, we’ll follow the pattern of map and filter and first declare a view:

struct FirstNSequenceView<Base: Sequence> : Sequence {
    private let _n: Int
    private let _base: Base
    init(_ base: Base, _ n: Int) {
        self._base = base
        self._n = n
    }

    func generate()
      -> GeneratorOf<Base.GeneratorType.Element> {
        var i: Int = 0
        var g = _base.generate()
        return GeneratorOf {
            ++i <= self._n ? g.next() : nil
        }
    }
}

This uses GeneratorOf, which takes for its constructor a closure that serves up values for each call to next(). This helps avoid having to manually write a companion generator class when you implement your own sequence (IndexingGenerator is a similar helper that can be used to conform to Sequence when you implement your own Collection). In this case, generate() creates a closure that captures a counter that will count up, returning elements from the sequence until it hits n. It then returns a GeneratorOf that will call that closure each time next() is called on it. Since each time generate() is called, a new closure with a fresh i will be created, this means FirstNSequenceView conforms to the requirement of Sequence that it can be walked multiple times with multiple independent generators.

Now that we have this view, it’s easy to declare first_n for all the lazy views:

extension LazySequence {
    func first_n(n: Int) -> LazySequence<FirstNSequenceView<LazySequence>> {
        return lazy(FirstNSequenceView(self,n))
    }
}
extension LazyForwardCollection {
    func first_n(n: Int) -> LazySequence<FirstNSequenceView<LazyForwardCollection>> {
        return lazy(FirstNSequenceView(self,n))
    }
}
extension LazyBidirectionalCollection {
    func first_n(n: Int) -> LazySequence<FirstNSequenceView<LazyBidirectionalCollection>> {
        return lazy(FirstNSequenceView(self,n))
    }
}
extension LazyRandomAccessCollection {
    func first_n(n: Int) -> LazySequence<FirstNSequenceView<LazyRandomAccessCollection>> {
        return lazy(FirstNSequenceView(self,n))
    }
}

It’s a bit of a hassle having to implement each one separately, but that’s the price you pay for structs that aren’t in a hierarchy. All the more reason to factor the hard part out into a seperate view object.

Does first_n belong as a lazy view member? Unlike map and filter, it doesn’t take a closure so the laziness is less likely to bite you. Maybe take_while, a filter that returned members of an array until a supplied predicate returned false, would have been a better example. But first_n could take a lazy sequence that itself depended on a closure, so the laziness of first_n could still come as a surprise. Also, the chaining is nice. Does this mean other lazy sequences like enumerate should migrate into being lazy class members? I dunno, maybe.

Without the chaining, you have to wrap the results in a function call. But the function call goes on the left, while chained calls go on the right, which results in a confusing ping-pong effect:

let l = lazy(first_n(lazy(r).filter(isOdd),5)).map(...)

The ordering of this is not obvious at first glance, but it’s filter, then first_n, then map. But then where does it end? Should everything get piled into the lazy classes just to benefit from chaining? Nah. The better solution is to implement a pipe forward operator, of which more some other time.

I have a nagging feeling this could be simplified. The Swift standard lazy functions don’t look quite this bad, they use a type of S instead of a type of themselves for the inner parts of the return values of map and filter. But then the compiler doesn’t completely faithfully represent the generic types in the declarations you see in Xcode (which is why there is a lot of T == T in what we see of the Swift standard library). Let me know if can figure out a better way. ↩
Also, if someone option-clicks the results, it’s obvious at a glance that the results are lazy. ↩
In fact, while a variant of the permutation version would work for forward and bidirectional collections as well, it’s not a good idea, because advance will take O(n) which is unnecessary, as the next version will show. ↩

Changes in the Swift Standard Library in Beta 4

Posted on July 21, 2014July 21, 2014 by airspeedvelocity

Other than access control,¹ no big changes to the language with Beta 4, unlike with Beta 3. But still plenty of changes to the standard library.

Some small bits and pieces:

Float and Double are no longer random access indices. Who was planning on indexing a collection with a Double, anyway? Just how big was your collection going to be?²
uppercaseString and lowercaseString properties are gone from String. I never liked them. Call a function a function, that’s what I say.
insertionSort and quickSort are gone. Generic non-name brand sorts only from now on.
Instead of having static members for true and false, Bool now implements the new BooleanLiteralConvertible protocol
CString is gone. You can still construct strings from c-style strings, but those now take a ConstUnsafePointer. Probably to discourage people from keeping the nasty little things hanging around in their native primordial form.
remainderWithOverflow replaces modulusWithOverflow
Yet more types are reflectable, and there’s now an enum value to represent Objective-C objects separate from Swift ones.

Now the bigger changes…

ArrayBuffer, ArrayBufferType, SliceBuffer and ContiguousArrayBuffer are gone. It was kinda surprising they were there in the first place, revealing the inner workings of Array. Sorry if you spent some time figuring out how to use them, but you probably had no business poking around in there. As a result, the makeUniqe member functions are also gone.

reverse is no longer lazy. It now gives you back an Array. Huh. Looks like if you really want to reverse something lazily, you now need to create a LazyBidirectionalCollection via a call to lazy(), then call .reverse() on it to get another lazy collection. Bit cumbersome, but probably more explicit.

There’s also LazyForwardCollection, LazyRandomAccessCollection, and LazySequence. lazy() will return the appropriate one from your base collection or sequence. All of them give a base the ability to map and filter, plus conversion to an array. Only the bidirectional and random ones can be reversed, obvs. And the random access one has a subscript. I guess I need to update my article on lazy collections and sequences.

Range is no longer sliceable. I don’t really “get” sliceable, but what I do get suggests Range shouldn’t have been sliceable.

Speaking of ranges, were you trying to use Range with floating-point types? Well cut that out and use StrideTo and StrideThrough instead, both created with a call to stride (with either a to or a through as the second argument name). They mirror respectively a half-open and closed version of Range. The big difference being if your stride length takes you past the end, they stop striding.

The version of filter that took a collection not a sequence is gone. It was a bit pointless anyway, since you can’t index randomly into a filtered collection without doing all the filtering up to that point. And the sequence version is no longer lazy, it returns an Array. If you want to filter lazily, use the explicitly lazy stuff above. If you want to filter starting from the back, you can do lazy().reverse().filter()

As ever, I refrain from commenting on anything relating to Unicode and Objective-C bridging, for fear of demonstrating my ignorance.

About which I appear to be out of the mainstream in not being desperate for. I mean it’s important before go-live, sure. But some people seem to need a cold shower after hearing it’s in this beta. ↩
Or how small! No, wait… ↩

Jumping to conclusions

Posted on July 19, 2014 by airspeedvelocity

In an article yesterday, I mentioned that I wasn’t having much luck extending Dictionary to give it an initializer that took a sequence of tuples, because my attempt wasn’t compiling, even though I was sure that it should:

extension Dictionary {
    init<S: Sequence
      where S.GeneratorType.Element == Element>
      (_ seq: S) {
        self.init()
        for (k,v) in seq {
          self[k] = v
        }
     }
}

Looked like the problem was with destructuring a typealiased tuple type into its consitutents. Like a good little beta tester, I settled in this morning with a fresh pot of coffee to file a bug report. My plan was to reproduce the problem without the Dictionary part, to try and clarify exactly where the issue lay. I tried writing a couple of classes that used typealiased tuples, one taking two generic placeholders and typealiasing the pair of them as a tuple, and then writing a function on one that took the other.

After about 20 minutes plugging away at this, I was still unable to reproduce it independently, and was geting quite frustrated. Then I realized hang on, what made me so sure this was a problem with tuples?¹ I pushed aside my failed attempt and just tried compiling this, and got the exact same error:

extension Array {
    func someFunc<S: Sequence
      where S.GeneratorType.Element == Element>
      (_ seq: S) {
        for e in seq {

        }
     }
}

Well, shit. That’ll teach me to overthink a problem. I scold myself for not applying Occam’s razor and assuming the cause was more complicated than it was.² Once I stopped obsessing over the tuple part, the next step in fixing my code became obvious. for ... in is just shorthand for creating a generator and then looping while next() returns non-nil elements. So I tried this:

extension Array {
    func someFunc<S: Sequence
      where S.GeneratorType.Element == Element>
      (seq: S) {
        var gen = seq.generate()
        let elem = gen.next()
    }
}

Bingo! The call to gen.next() reports Cannot convert the expression's type 'Self.Element? to type 'Self.Element?'. Somehow type inference is borked. But if I qualify the declaration of elem as elem: Element? it works fine.

So now I know how to work around it, I can finally write my sequence initializer for dictionary.

But first: @rbrockerhoff replied yesterday with a gist that was very close to the Sequence initializer (it took MapCollectionView as an argument rather than the sequence directly, which somehow avoids the type inference issue in a similar same way that my version that took an Array did).

As part of this, he did something very smart, which is to separate out inserting the sequence into the dictionary into another method, and then calling it from init instead of having the loop in init itself. This is a great example of solving something generically first, and then specializing your generic solution to the immediate problem. So here’s the full solution:

extension Dictionary {
    init<S: Sequence
      where S.GeneratorType.Element == Element>
      (_ seq: S) {
        self.init()
        self.merge(seq)
    }

    mutating func merge<S: Sequence
      where S.GeneratorType.Element == Element>
      (seq: S) {
        var gen = seq.generate()
        while let (k: KeyType, v: ValueType) = gen.next() {
            self[k] = v
        }
    }
}

let pairs = enumerate(["zero","one", "two"])
let d: [Int:String] = Dictionary(pairs)
d[1] // returns {Some "one"}... yes!

Just one wrinkle – you still have to give the dictionary variable a type, it won’t infer it. I’m not sure if this is a bug in my code, a Swift bug, or a feature not a bug. If you know, let me know.

I was possibly nudged towards the tuple assumption because of another bug in playgrounds with mapping arrays of tuples. ↩
I did the same thing yesterday when I convinced myself the only way to write my own Dictionary.init() was to create another dictionary and assign it to self, before a helpful comment from @NachoSoto suggested I just try self.init(). Duh. ↩

On dictionaries and initializers, in which I give up and use a for loop

Posted on July 18, 2014July 20, 2014 by airspeedvelocity

On twitter, @rob_rix confessed that he’d given up and used a for loop. @rbrockerhoff replied he had too, in his case because you can’t create a dictionary from an array.

That tweet sent me deep down the rabbit hole experimenting with Dictionary. I keep wanting to fix it, like a scruffy boyfriend. But like with said boyfriend, that’s easier said than done.

Warning: some of the below is very experimental, thoroughly likely to crash your playground, and quite possibly wrongheaded in lots of places. If you spot any of the latter, do tweet me a correction @airspeedswift.

The root of the problem is that, unlike Array, Dictionary doesn’t have an initializer that takes a Sequence of key/value tuples. If it did, you could do cool stuff like this:

let words = ["zero", "one","two"]
let enumerated = enumerate(words)
let number_to_word = Dictionary(enumerated)
// results in [0:"zero", 1:"one", 2:"two"]

let d = [0:"zero", 1:"one", 2:"two"]
let flipped = Dictionary(Zip2(d.values, d.keys))
// results in ["zero":0, "one":1, "two":2]

Now, if you root around in the standard library¹ you’ll find it does have this function, which it implements as part of the DictionaryLiteralConvertible protocol:

static func convertFromDictionaryLiteral
  (elements: (KeyType, ValueType)...) 
  -> [KeyType : ValueType]

This is tantalizingly close to what we want, at least for arrays if not sequences. It takes a variadic argument of tuples, which will in the function itself be represented as an array of tuples. Hey, we want to create a dictionary from an array of tuples! Can we use this?

Nope. Try as you might, you can’t shove an array through that function. If you managed to get one in, inside the function it would appear as an array inside an array. Languages like Ruby and Python have the ability to “splat” an array into a function as its parameters, but (as far as I can tell) there’s no way to do that in Swift.

Sigh. I guess there’s no alternative but to use a loop. But to make myself feel better about it, I’ll fix something else that bugs me about Dictionary. As we saw, convertFromDictionaryLiteral takes an array of (Key, Value) tuples. But the one function in Dictionary that also takes a key and a value takes them the other way around:

func updateValue(value: ValueType, forKey key: KeyType) -> ValueType?

You know how I said you can’t pass an array into a variadic argument in Swift? Well, you can pass a tuple into a function that has the same number of arguments as the tuple:

func takeTwoArgs(arg1: Int, arg2: String) {
//  do stuff with arg1 and arg2
}

let pair = (1, "one")
takeTwoArgs(pair)

To take an array of tuples and insert them into a dictionary would be a lot nicer if you could just pass the elements of the array in directly, rather than having to flip them around. This will do it:

extension Dictionary {
    // offensive vague naming of a function alert...
    mutating func update
      (key: KeyType, _ val: ValueType) 
      -> ValueType? {
        return self.updateValue(val, forKey: key)
    }
}

let pairs = [(0,"zero"), (1,"one"), (2,"two")]
var d: [Int:String] = [:]
for pair in pairs {
  d.update(pair)
}

OK, doing that makes me feel a bit more cheerful, but I’d still like to be able to construct a dictionary from scratch using an initializer. This isn’t just a whim, by the way – having to do the for loop above forces you to declare d with var, whereas an initializer could be used with let.

So how about we extend Dictionary with an initializer:

extension Dictionary {
  init(_ array: [Element]) {
    for pair in array {
      self.update(pair)
    }
  }
}

Oops, a mysterious compiler error. “Variable 'self._variantStorage' used before being initialized“. We’ve accidentally been exposed to the inner implementation of Dictionary. And unlike Array, which has some hints at how you might manipulate the underlying store with structs like ArrayBuffer, Dictionary is entirely enigmatic. We could try and figure out how to initialize _variantStorage by hand but that’s just going to lead to tears.

Or, we could just create another Dictionary, populate it, and then assign it to self right at the end of the initializer:

edit: Ignacio in the comments points out the (now blindingly obvious) alternative of just calling self.init() instead – thanks! But I leave the self-assignment here as something of interest that works.

edit edit: Maybe not so fast. self.init() works in playground, but not in compiled code right now.

extension Dictionary {
    init(_ array: [Element]) {
        var d: [KeyType:ValueType] = [:]
        for pair in array {
            d.update(pair)
        }
        self = d
    }
}

I have no idea if this is supposed to be allowed or not. I mean, it seems to work fine, but I think I might have voided the warranty. Certainly I’m going to think twice about filing this code as part of a radar.

Nevertheless, we can now do this:

let pairs = [(0,"zero"), (1,"one"), (2,"two")]
let d = Dictionary(pairs) // woo-hoo
// d now contains [0:"zero", 1:"one", 2:"two"]

For my last trick, I would love to say I’ve generalized this to work for sequences, but unfortunately I still can’t figure that one out. This seems like it ought to be the solution:

extension Dictionary {
    init<S: Sequence
      where S.GeneratorType.Element == Element>
      (_ seq: S) {
        var d = [KeyType:ValueType] = [:]
        for pair in seq {
            d.update(pair)
        }
        self = d
    }
}

but it segfaults the compiler (I need to find a way to reproduce this without the dodgy self-assignment, so I can file a radar with less silly code…) (edit: well, now I know a way! Was hoping that would fix the segfault, but no)

As a workaround, I’ve tried creating an array with the sequence and then using the previous initializer:

extension Dictionary {
    init<S: Sequence
        where S.GeneratorType.Element == Element>
        (_ seq: S) {
            var d = Dictionary(Array(seq))
            self = d
    }
}

However, this leads to the compiler compliaining that “'WhateverYourSequenceTypeIs' is not convertible to '[(KeyType, ValueType)]'“. This suggests it’s not matching the sequence initializer. If anyone can spot what’s wrong here and has a fix, let me know. For now, just stick an Array() around your init arguments and pretend you’re in for-loop free nirvana.

edit: found it! Read the next post for a solution.

In case you don’t know how to do that, type import Swift into a playground then command-click on the word Swift. ↩

A better way to hit bottom

Posted on July 14, 2014July 15, 2014 by airspeedvelocity

After my previous post on finding minimums, a couple of people pointed out that reduce would be much better suited to the task. They’re absolutely right, but I ignored that for the last post since I had another agenda. Plus, the combination of min and reduce is an interesting one with its own wrinkles, that deserves a post of its own, so here it is.

reduce takes a sequence, a starting value, and a combining function, and uses that function to boil the sequence down to a single value. For example, to total up the values in an array:

let a = [2, 4, 6, 8]
// starting with 0, add each value
// of 'a' to a running total
let sum = reduce(a, 0, +)
// sum = 20

That certainly sounds like what we want for finding the minimum value in a sequence, with min as the combining function (as recently made possible in beta 3):

let a = [5, 4, 3, 8]
// but what to put in the starting value?
let the_min = reduce(a, ???, min)

We need to pick a starting value to use on the first call to min. Unlike with addition, there’s no natural constant to start a search for a minmum value. You could start with the maximum value for an Int, but that smells like a hack.

Using the first element of the sequence would do it:

var gen = a.generate()
if let first = gen.next() {
    reduce(a, first, min)
}

But this means the first thing this code does every time is redundantly compare the first element to itself. Strictly speaking, that’s fine from a functional behaviour point of view, and even the performance consequences amortize to negligible, but it’s not very satisfying.

This version avoids that problem:

var gen = a.generate()
if let first = gen.next() {
    reduce(gen, first, min)
}

Why does that work? Because Array.GeneratorType happens to be both a Generator and a Sequence. In fact, in a quick scan through the Swift standard library, I couldn’t spot any generator that wasn’t also a sequence. But just because they always are doesn’t mean they have to be, which is why implementing a generic function tail that returns all but the first element of a sequence is surprisingly tricky – but let’s leave that for a later post.

So, armed with our ability to use reduce to find a minimum, let’s implement our solution from the previous post that finds the minimum numeric value in an array of strings:

let string_ints = ["dog", "12", "cat", "4", "200", "elephant"]

var gen = string_ints.generate()
if let first = gen.next() {
    reduce(gen, first) { (lhs, rhs) in
        switch (lhs.toInt(), rhs.toInt()) {
        case (_, nil): return lhs
        case (nil, _): return rhs
        case (let i, let j): return i < j ? lhs : rhs
        }
    }
}

OK that works, but this is pretty ugly with the generator on top of that convoluted closure. And remember last time I mentioned there were still a couple of issues – if string_ints contains no integer strings, it’ll just return the first non-integer string. And it does every integer conversion twice.

Do we still need the generator? Our closure that chooses between sequence elements now discards nil values as soon as it finds a non-nil one. That means that we could supply nil as our initial argument. This also means a returned value of nil could stand in for “not found”, either because the sequence was empty, or because it contained no strings convertible to numbers.

But before we make that change, let’s neaten things up by pulling out the code that does the nil discrimination into its own function.

Here is combineMaybe, a function that takes a function that combines two values (like min), and returns a new function that chooses between two optionals, always preferring the non-nil one before resorting to the supplied function if neither value is nil:¹ ²

func combineMaybe<T>
  ( combine: (T,T)->T ) 
  -> (T?,T?)->T? {
    return { lhs, rhs in
        switch(lhs, rhs) {
        case (_, nil): return lhs
        case (nil, _): return rhs
        default: return combine(lhs!,rhs!)
        }
    }
}

combineMaybe is a function decorator – a function that takes another function, and enhances it in some useful way, returning the enhanced function.

We can now use the standard library version of min, enhanced with our decorator, as the combiner function to reduce, supplying nil as the starting value. An initial call to map to first convert the strings to integers generates the input sequence:

let string_ints = ["dog", "12", "cat", "4", "200", "elephant"]
let maybe_ints = map(string_ints) { $0.toInt() }

if let the_min = reduce( maybe_ints, nil, combineMaybe(min)) {
    // the_min = 4
}
else {
    // handle both empty input or no numerics in input
}

This fixes the bug of returning a non-numeric string when that’s all the sequence contains. And it only converts each number once, in the call to map.³ ⁴

If instead we wanted to pull out the maximum values, all that is needed is to replace the passing of min into combineMaybe with max, with everything else remaining the same.

In fact, it’ll even work with other binary functions as well. If you replaced min with +, it’ll add all the numeric values in the array, while still giving you a chance to catch if no numbers were found in the sequence (sometimes zero is still not the right answer for an empty sequence, even one containing integers rather than strings).

The “maybe”-based name comes from Haskell, where maybe is a function that takes a default value, a function, and an optional type (which are called Maybe types). ↩
As ever, really wishing switch was an expression that returned the value of its cases when they were single expressions. ↩
Ok, so this doesn’t fulfil the original requirement of returning a string rather than an integer (from the previous article). I think we can keep that as a subsequent step, I was mostly being antagonistic about that part. ↩
Haskell also has a function, mapMaybe, that is like a map to an optional value, followed by a filter that removes all the resulting nil values. That would be a good building block for solving this problem too, though it still wouldn’t help with the starting value for reduce. ↩

Swift Structs and accessing properties by name

Posted on July 13, 2014July 13, 2014 by airspeedvelocity

Brent Simmons posted a blog article today about a feature he was missing from Swift that was readily available in Objective-C – the ability to access properties of an object indirectly using a string.

Brent’s goal is to be able to write some generic code that compares various different properties. Here’s how he puts it:¹

For every merge-able property there are really two properties, and there’s a simple naming convention.

For example, a note’s archived flag has two properties: archived and archivedModificationDate.

Merging works like this pseudo-code:

if existingObject.archivedModificationDate < serverObject.archivedModificationDate {
existingObject.archived = serverObject.archived
existingObject.archivedModificationDate = serverObject.archivedModificationDate
}

But the code doesn’t actually look like that. Instead of duplicating the above for each property, there’s a single method that takes server object, existing object, and property name as parameters. Then it uses valueForKey: on both server object and existing object. (It gets the key for the date value by adding “ModificationDate” to the property name.)

This turns merging a given property into a one-liner:

mergeProperty(existingObject, serverObject, “archived”)

He asks if there’s a way to do this right now. The answer is almost, with some shortcomings (most of which will hopefully get resolved before the end of the beta).

Firstly, here’s an example version of a struct with some properties:

(I’m using Int as a stand-in for a date/time object, for simplicity)

struct Properties {
    // someday soon these can be private
    var _name: String, _nameModificationDate: Int
    var _shape: String, _shapeModificationDate: Int

    init(name: String, nameModificationDate: Int,
         shape: String, shapeModificationDate: Int) {
            self._name = name
            self._nameModificationDate = nameModificationDate
            self._shape = shape
            self._shapeModificationDate = shapeModificationDate
    }

    func name()->String { return _name }
    func nameModificationDate()->Int { return _nameModificationDate }

    func shape()->String { return _shape }
    func shapeModificationDate()->Int { return _shapeModificationDate }
}

Note that the “getter” methods for name and shape are methods, not var properties with a { get }. This will be important later.

This struct is both immutable in the sense that there are no setter methods, and also can be immutable because it’s a struct you can declare with let instead of var. There’s an important distinction here – classes can be immutable too, in the former sense. A pattern in the Swift standard library is to declare an immutable version of a protocol that only has read methods (such as Collection, that declares an rvalue version of subscript), and then an inherited version that is mutable and has write methods (such as MutableCollection, that declares an lvalue version of subscript).

Of course, right now there’s no private variables in Swift, so internal variables are still writeable, but Apple have made it pretty clear this is coming in a later beta.

Having declared our struct, we create a dictionary that can be used to access the getter methods:

let propertyDict = [
    "Name":  (Properties.name, Properties.nameModificationDate),
    "Shape": (Properties.shape, Properties.shapeModificationDate),
]

This dictionary maps a string name onto a tuple containing two member function references,² one for the property and one for its associated modification date. They have a type of Properties->()->String (or ->Int ). You use them first by applying an instance of the object, which returns a new function bound to that instance, and then calling the function.

Here is why the accessors had to be methods. To my knowledge, you can’t do the same thing with var propertyName { get } properties, for the simple reason they don’t need () to invoke, so you can’t leave it off to get the method reference. Maybe under the hood there’s a secret get_propertyName() function you could use, but I can’t find it.

To demonstrate these in use, here’s the implementation of the mergeProperty function:

func mergeProperty(existingObject: Properties,
    serverObject: Properties,
    propertyName: String) 
    -> (String, Int) {

        // fetch the property-accessing member function references
        // associated with propertyName
        if let (property, propModDate) = propertyDict[propertyName] {

            // now get the modification date for this property
            // for both the existing and server versions
            let existingModDate = propModDate(existingObject)()
            let serverModDate = propModDate(serverObject)()

            // do your date comparison to pick the winner
            if existingModDate >= serverModDate {
                return (property(existingObject)(), existingModDate)
            }
            else {
                return (property(serverObject)(), serverModDate)
            }
        }
        else {
          // throw an exception! (only kidding, in practice mergeProperty's 
          // return type should probably be optional and return nil to 
          // account for this possibility
          return ("",0)
        }
}

And here is the mergeProperty function in use:

let existingObj = Properties(name: "Fred", nameModificationDate: 10,
    shape: "Square", shapeModificationDate: 20)

let serverObj = Properties(name: "Bob", nameModificationDate: 12,
    shape: "Circle", shapeModificationDate: 18)


let (mergedName, _) = mergeProperty(existingObj, serverObj, "Name")
let (mergedShape, _) = mergeProperty(existingObj, serverObj, "Shape")

// mergedName = Bob, and mergedShape = Square

There are a lot of likely objections to the above code, some of which can be addressed by more code, some by possibly-upcoming Swift enhancements, and some that are just fundamental to Swift.

The hardcoding of the entries in propertyDict. This is the inevitable result of there being no official reflection in Swift. Yet. There’s clearly the beginnings of reflection with the getMirror() method and AnyClass type. I’d be surprised if Swift made it all the way through beta without it being added. Once there’s reflection, you could build up propertyDict at runtime (and also hopefully compile time if you prefer) based on a list of property names and a naming convention.

Jason Peebles tweeted an alternative, that involves access to the raw member variables using the nascent reflection capabilities. This is very interesting, and kind of like solving this from the other end. My concerns with this approach is it feels better to me to use methods, though maybe that’s my own prejudice. Also, that current reflection API is possibly temporary and only there to facilitate the playground and debugger. Eventually when full reflection is implemented, hopefully both methods and member variables will be exposed (maybe they are already with a bit more playground spelunking).

It only supports strings. In real-world use, you would wrap up propertyDict and mergeProperty together into a class (along with the reflection that took the metaclass and a list of property names), which you could then make generic to operate on Int or String accessors depending on need.

Use of methods instead of properties. I don’t know how to fix this right now. Leave a comment if you do! But for the most part, anything you can do with properties you can also do with getting and setting methods, so it’s not a deal-breaker.³ This is the biggest sticking point though.

It’s not how it works in Objective-C. More of a philosophical difference, this one. But you could probably reproduce nearly all the behaviour of valueForKey using the above technique. The solution using generics is still going to leave you fighting the type system a bit more than you might in Objective-C, and while using Any instead might get you closer, I usually find myself fighting even harder with Any than I did with types.

An implementation that is true to the valueForKey behaviour, by which any property can be fetched with a single method call, seems by definition to require a return of an Any variable which gives up much of the type safety Swift is forcing you towards. Maybe whatever reflection solution eventually emerges will show a pleasing middle-ground.

Anyway, hopefully this solves at least part of Brent’s problem. Maybe some day some of this code will make it into Vesper, which would be very exciting as I’m a big fan of the app.

Sorry about butchering the code part. My blog engine and I are not getting along. ↩
I made that term for them up, by the way. It’s not like an official name for them or anything. It’s what they’re called in C++, which Swift seems to model in this respect. ↩
At least until Apple implements KVO in Swift only for properties. Then this technique is screwed. ↩

Always write a generalized version of your algorithm

Posted on July 12, 2014July 14, 2014 by airspeedvelocity

Suppose you want to find the minimum value in an array. Well, the Swift standard library has you covered:

let a = [6, 2, 4, 7, 8, 10]
minElement(a)  // returns 2

Ok that’s cool. But suppose you have an array of strings of integers and you want the minimum numeric value.

Well, I guess you could map them to Int and then use the same function:

let a = ["6", "2", "4", "7", "8", "10"]
minElement( a.map { $0.toInt() })

Nope, compiler error. String.toInt returns an Int?, because maybe the string isn’t a number.

Fair enough. I happen to know with my intimate knowledge of the problem domain that we can just ignore non-numeric values. So maybe filter those out?

minElement( a.map { $0.toInt() }.filter { $0 != nil } )  // hmmm

Nope, more compiler errors, because now we have only non-nil values, but they’re still optionals and an Int? doesn’t conform to Comparable. So maybe another map to force-unwrap the results (force unwrap is OK because the filter will guarantee no non-nil elements):¹

minElement( a.map{ $0.toInt() }.filter { $0 != nil }.map { $0! } )  // yuck

Ok that compiles now and does return the minimum numeric value. But what if we wanted the result to still be a string?

"\(minElement( a.map{ $0.toInt() }.filter { $0 != nil }.map { $0! } ))" // bleurgh

That’s pretty horrible, and at this point I’m thinking of registering GiveUpAndUseAforLoop.com,² and Marco Arment is yelling at you kids with your new language to get off his lawn.

Really, what we need is for minElement to take a predicate instead of only using the default <. It doesn't (mutter, grumble) but if it did, we could write this:

minElement(a) {
    switch ($0.toInt(), $1.toInt()) {
    case (_, nil): return true
    case (nil, _): return false
    case (let i, let j): return i < j
    }
}

Ahh. Faith in the functional programming style restored.³ This takes each element, checks if the toInt() of either is nil using pattern matching, always preferring the non-nil version, and if neither are, compares them. Since all it's doing is comparing the converted value, not returning one, the minimum returned is a string.⁴

There's nothing stopping you from implementing your own overload of Apple's minElement that takes a predicate:⁵

func minElement
  <E, R: Sequence
  where E == R.GeneratorType.Element>
  (range: R, pred: (E,E)->Bool) -> E {

    var gen = range.generate()
    var min_so_far = gen.next()!

    while let next = gen.next() {
        if(pred(next,min_so_far)) {
            min_so_far = next
        }
    }

    return min_so_far
}

This version copies the Swift library version's questionable practice of exploding in flames if you pass it an empty sequence – the (better?) alternative being make the return value optional to force the caller to acknowledge this possibility.

The moral here is that if you ever write a high-order function, write it as generally as possible. It's only a few extra lines to declare the common case using the general case:⁶

func minElement
  <E: Comparable, R: Sequence
  where E == R.GeneratorType.Element>
  (range: R) -> E {
    return myMinElement(range, <)
}

Apple does this for most functions (such as sort), just not minElement. But then it is only beta 3, hang in there.

“I thought it was guaranteed not to be nil at this point” is what you could write to the log file in the outer exception handler you can’t have. ↩
Just kidding, I didn’t think about it, I registered it straight away. ↩
This code would be even neater if the Swift switch statement would implicitly return the values of single-line cases, i.e. if all the returns in that closure were removed, the switch would evaluate to the value of the selected case statement, and the closure would return it. They closed my radar as a dupe of 16169366 (don't file another duplicate though). ↩
It's still not perfect – if none of the elements are numeric, it'll return the first one. Also, it'll convert each element twice. Hold that thought for a later post though. ↩
Actually, there might be some compiler bugs stopping you, but I'll leave you to discover those on your own. ↩
If the language would let you, you could even default the last parameter to <, but it won't, so you can't. More dynamic default arguments (including ones derived from other arguments to the same function) could be pretty useful in cutting down the amount of code you need to write for these general- and special-case versions. ↩

Swift’s Lazy Collections and Sequences

Posted on July 10, 2014March 9, 2015 by airspeedvelocity

Warning: this article is out of date as of Swift 1.0 beta 4. Read this updated version instead.

The Swift non-member function map acts much like the Array.map member function, except you can pass in any kind of sequence or collection (click here for an article on how to write an algorithm like this).

But if you look at what map returns, it isn’t an array of results. It’s an object (a MapColectionView or a MapSequenceView).

That’s because map evaluates its results lazily. When you call map, no mapping takes place. Instead, the input and mapping function are stored for later, and values are only mapped as and when they are needed.

To see this happening, run the following code:

let r = 1...3
let mapped = map(r) {
    (i: Int) -> Int in
    println("mapping (i)")
    return i*2
}

println("crickets...")

for i in mapped {
    println("mapped (i)")
}

println("tumbleweed...")

println("index 2 = (mapped[2])")

// prints out
crickets...
mapping 1
mapped 2
mapping 2
mapped 4
mapping 3
mapped 6
tumbleweed...
mapping 2
index 2 = 4

The returned object from map mimics the properties from the object passed in. If you pass in a sequence, then you get back a sequence. If the collection you passed in only supported a bidirectional index (such as a Dictionary or a String, which can’t be indexed randomly), then you won’t be able to subscript arbitrarily into the mapped collection, you’ll have to walk through it.

map is not the only lazy function. filter and reverse also return these “view” classes. Combining these different lazily evaluated classes means you can build up quite complicated expressions without sacrificing efficiency.

Want to search backwards through an array or string? find(reverse(myarray), value) won't actually reverse the array to search through it, it just iterates over the ReverseView class, which serves up the array in reverse order.

Want the first 3 odd numbers that are a minimum of two arrays at each position? first_n(filter(map(Zip2(a1, a2), min) {$0%2 == 1}, 3) wont do any more zipping or minning or modding than necessary.¹²³

There are also several lazily generated sequences you can also use:

enumerate numbers each item in a collection — it returns an EnumerateGenerator, that serves up those numbered elements on demand. This is useful in a for (count, val) in enumerate(seq) loop where you want to know both the value and how many values so far.

PermutationGenerator is initialized with a collection and a collection of indices into that collection, and serves up each element at each index in that order (thus allowing you to lazily reorder any collection – for example, PermutationGenerator(elements: a, indices: reverse(a.startIndex..<a.endIndex)) would be the reverse of a.⁴

GeneratorOf just takes a closure and runs it each time to generate the next value. So var i = 0;var naturals = GeneratorOf({ ++i }) sets naturals to a sequence that counts up from 1 to infinity (or rather, to overflow).

These infinite virtual sequences can be pretty useful. For example, to do the same thing as enumerate, you could write Zip2(naturals, somecollection). Or you could pass your infinite sequence into map or filter to generate further inifinite sequences.

(at the opposite end, if you have a single value, but what a function needs is a Sequence, you can use GeneratorOfOne(value) to create a quick sequence that will just serve up just your one value)

It's not all rainbows and unicorn giggles though. Imagine you did the following:

let r = 1...10_000
mapped = map(r) { megaExpensiveOp($0) }

let a1 = mapped
let a2 = mapped

Here the lazy evaluation will work against you. megaExpensiveOp will be run not ten but twenty thousand times.

“Shouldn't map cache the data?” you ask. Well that leads to the next complication. Take this code:

var i =0
let mapped = map(1...5) {
    i += 1
    return $0 + i
}

Every time you iterate over mapped now, you'll get different results.⁵

This behaviour might be put to very good use (say you wanted to peturb a data set with small random values). But if you weren't expecting it, that could be one nasty bug to track down.

So its good to know when you're using these functions what they are actually doing under the hood. Just keep in mind these caveats and an exciting life of lazy evaluation awaits you in the off-world colonies.

OK I cheated, first_n isn't a Swift standard library function. But it should be! ↩
This code would look a lot nicer if Swift came with a pipe operator ↩
Yeah, this is a pretty rubbish example, I should have put the time in to think of a better real-life use case. ↩
Obviously, reverse(a) would work better, but a more useful example is tough to fit on one line… ↩
If you're unclear on how come, try reading this article. ↩

Changes to the Swift Standard Library in Beta 3

Posted on July 8, 2014July 9, 2014 by airspeedvelocity

Xcode Beta 3 came out today, and with it a new version of the Swift standard library.

By far the majority of changes are to account for the publicized updates to the language:

replacing Type[] with [Type], and making use of the new [Key:Val] sugar for dictionaries
switching uncheckedArithOp for arithOpWithOverflow in all the numeric types
adding explicit bigEndian and littleEndian initializers to the integers
replacing succ and pred with successor and predecessor¹
adding sorted as an operation on arrays and as a standalone algo, since sort now sorts in-place²

copy and unshare have been removed from Array, since they’re redundant now arrays have full value semantics. Assigning an array to another variable copies it. Or rather, it lazily copies it – the array will only by physically copied when you alter one of the elements. So really, the equivalent of unshare is called automatically when any value in the copied array or the copy is altered. You can still detect when this happens with the === operator, which returns true if two arrays are pointing to the same underlying store:

var a = [1, 2, 3, 4]
let b = a  // 'a' is "copied"
// but really, it's still the same
a === b    // returns true
a[2] = 2   // until you alter 'a'
// and now they're different
a === b    // returns false

This is really important to understand, in case you start to panic about all the unnecessary copying that might happen when you pass an array into a function as a parameter.

There are also a handful of other changes/additions that aren’t mentioned in the release notes.

Range now supports map as a member, so you can write (1...5).map { $0 * 2 } much like with an Array, rather than map(1...5) { $0 * 2 }. Note, Range.map returns an Array, unlike the map algorithm, which returns a lazily-computed view class.

Lots more types (such as Range) now have a getMirror() method.

min and max, which previously took two or more parameters (the “more” part being a variadic parameter) now have an overload with just two parameters. This means you can pass them as an argument to another function that expects a binary operator. So the following now compiles where previously it would throw an error:³

let a = [5, 3, 1, 7, 9]
let b = [2, 4, 1, 1, 7]

// now you can write this:
let mins = map(Zip2(a, b), min)

// instead of being forced to write this:
let mins = map(Zip2(a, b)) { min($0, $1) }

Finally, the already mysterious insertionSort got a bit more mysterious.

Previously there were two versions, both of which required a collection with a bidirectional index. One just required that the collection contents be comparable. The other allowed the caller to supply a predicate to compare elements with. That’s a pretty standard pattern, but oddly that predicate is declared inout. Why a sort algorithm needs to be able to change the caller’s comparison function I do not know, but it can!⁴

The new insertionSort that doesn’t take a predicate now requires a random-access index, rather than a bidirectional index. The version that takes the predicate still only requires a bidirectional index. It’s predicate is still inout though.

There are a few other changes, relating to unicode character processing and Objective-C interfacing, that I either don’t feel qualified to comment on or haven’t delved into. Feel free to mention antyhing interesting about them in the comments.

Bit odd, that, I preferred the shorter versions. ↩
I really hate they did this to the non-member version, as it means you can’t chain it. ↩
If you don’t follow what this does – Zip2 takes two sequences and “zips” them together, producing a new sequence that has pairs of elements from the same point in both sequences. map then takes each element of this sequence and applies a function to each one. That function, min, takes the pairs and returns the minimum. So the whole expression returns a sequence of the minimum of each position in either array. ↩
It does mean you can’t call it with a trailing closure expression. You have to create a variable and point that at a function and then pass that variable in. That can’t be the reason, can it? ↩

Extending the Swift language is cool, but be aware of what you are using

Posted on June 28, 2014July 11, 2014 by airspeedvelocity

Features like the @auto_closure keyword in Swift offer a great opportunity to extend the language in ways that might not be possible in other languages. For example, in this article we used it to implement the Ruby ||= operator.

Swift doesn’t yet have a native implementation of unless or until. These are just versions of if or while, but for when the predicate is not true. Obviously you could just stick a ! in front of your if clause, but some programmers prefer the readability of the opposing versions. The implementation of ||= could be rewritten as:

@assignment func ||=<T>(inout lhs: T?, rhs: @auto_closure () -> T) {
    unless(lhs) {
        lhs = rhs()
    }
}

I know, you think, I can implement unless and until myself! What’s nice is you can make them look almost like the native flow control statements, because closure expressions can be outside the parentheses of function calls if they are the last argument.¹ Implementing unless seems pretty straightforward:

func unless<L: LogicValue>(pred: L, block: ()->()) {
    if !pred {
        block()
    }
}

// doSomething only if condition is false
unless(condition) {
      doSomething()
}

until is a little trickier, because you need to execute the predicate multiple times as you loop. But @auto_closure can help you out.

func until<L: LogicValue>(pred: @auto_closure ()->L, block: ()->()) {
    while !pred() {
        block()
    }
}

// doSomething until condition becomes true
until(condition) {
    doSomething()
}

Now, the conditional expression passed as the first parameter to until will be automatically wrapped up into a closure expression and can be called each time around the loop. Any variable used in both the parentheses and the block will be captured by both, so altering it in the block will affect the result of the condition (hence the condition can become true over time).

Armed with your new unless and until functions, you write a function to search a collection for the index to the first occurrence that doesn’t match a predicate:²

func findIfNot
  <C: Collection, L: LogicValue>
  (col: C, pred: (C.GeneratorType.Element) -> L)
  -> C.IndexType? {

    var idx = col.startIndex
    until(idx == col.endIndex) {
        unless(pred(col[idx])) { return idx }
        idx = idx.succ()
    }
    return nil
}

let a = [1, 2, 3]
let isOdd = { $0%2 == 1}
// find the first non-odd number
let idx = findIfNot(a, isOdd)

Arguments about SESE and other stylistic preferences aside, this function should do the job. Except instead it generates a compiler error, 'C.IndexType' is not convertible to '()'.

Why? Because the return after the unless is a return from that closure expression between the braces, not a return from the findIfNot function. That unless closure expects no value to be returned, so when you return idx it’s an error. But if you were just blithely using this unless function you found in a library as if it were just like an if statement, you might not realize this and the compiler error may come as a shock.

Now say you instead implemented findIfNot to take in an index into the collection as an inout, and advanced that index to the first non-match, instead of returning a value (and returning the endIndex if every item matches). This is pretty bad code, but here it is:

func findIfNot
  <C: Collection, L: LogicValue>
  (col: C, inout idx: C.IndexType,
  pred: (C.GeneratorType.Element) -> L) {

    until(idx == col.endIndex) {
        unless(pred(col[idx])) { return }
        idx = idx.succ()
    }
}

let a = [1, 2, 3]
let isOdd = { $0%2 == 1}
var idx = a.startIndex 
findIfNot(a, &idx, isOdd)
// idx will not be what you would hope

Now, no compiler error. Instead, this code will fail much more subtly, always returning as if no non-match was found. If we replace the unless with an if, the code will run forever because the until block will return before moving idx forward, and then loop again.

Unit tests should catch this of course. But as a user of built-in-like functions, you should always be aware they aren’t quite the part of the language they might seem. The bugs resulting might not be as obvious as in this case. Also, this is a good argument for using a more “functional“ approach, by writing general algorithms like findIfNot rarely and testing them thoroughly, and then reusing them as much as possible, rather than writing explicit loops.

Should Apple extend Swift to cover this kind of case? Maybe a keyword that causes a closure that doesn’t return a value to be able to pass the return statement out and act on the calling function?³ Until they do, this is definitely a gotcha to be aware of.

Sadly your implementions will always be lacking one feature that if or while have – leaving off the brackets around the predicate. You can’t write unless so it can be used as if pred { } – unless Apple extends “if the last parameter is a closure expression you can have it outside the parens” to be “if there are two parameters and the last is a closure expression, you can leave the parens off the first parameter”. ↩
If you don’t follow what the code here is doing, try reading this introduction to algorithms on collections in Swift. ↩
If I’ve missed something huge here and there’s already a way around this issue, leave a comment! ↩

	Sorting Nibbles in S… on Sorting Nibbles in Swift
	Collection Data Stru… on Arrays, Linked Lists and …
	Swift化零为整：Reduce 详解… on Arrays, Linked Lists and …
	Writing A Generic St… on Generic Collections, SubSequen…
	Writing A Generic St… on Collection Indices, Slices, an…

	Sorting Nibbles in S… on Sorting Nibbles in Swift
	Collection Data Stru… on Arrays, Linked Lists and …
	Swift化零为整：Reduce 详解… on Arrays, Linked Lists and …
	Writing A Generic St… on Generic Collections, SubSequen…
	Writing A Generic St… on Collection Indices, Slices, an…

Airspeed Velocity

African or European Swift?

swift-lang