Collection and Sequence Helpers

Posted on July 28, 2014August 1, 2014 by airspeedvelocity

Before the next installment about laziness, an aside about sequence and collection helpers.

In the previous article, I used a couple of helper objects to generate sequences without having to implement a whole sequence class. The Swift standard library provides quite a few of these, which I describe below.

I’m omitting the view classes returned by lazy, map, filter, enumerate and reverse, as they are tightly associated with their creating function and probably shouldn’t be used as stand-alone things.

CollectionOfOne
When you have a single variable, and you want a collection containing just it (for example, you want to pass it into a function that expects a collection).

CollectionOfOne uses GeneratorOfOne (see below) as its generator.

let i = 42
let just_one = CollectionOfOne(i)
just_one[just_one.startIndex] // returns 42
countElements(just_one) // returns 1
just_one.element // returns 42

EmptyCollection
When you want an empty collection containing none of a specific type. This is only really useful when you specifically don’t want that collection to be an Array for some reason (like in a unit test).

EmptyCollection uses EmptyGenerator (see below) as its generator.

let empty = EmptyCollection<Int>()
countElements(empty) // returns 0

EmptyGenerator
Has no public init, psych! Use GeneratorOfOne, see below.

GeneratorOf: Use Case 1
When you want a generator that serves up elements based on a closure you supply.

var infinite_ones = GeneratorOf { 1 }
infinite_ones.next() // returns 1?
infinite_ones.next() // returns 1?

var i = 0
var naturals = GeneratorOf { ++i }
naturals.next() // returns 1?
naturals.next() // returns 2?
naturals.next() // returns 3?

GeneratorOf: Use Case 2
When you want to treat different generators that generate the same type as all the same type. GeneratorOf has an init that takes other types of generators and returns the same type based only the type of their Element.¹

let r = 1...3
let rg = r.generate()

let a = [11, 22, 33]
let ag = a.generate()

// this won't compile, rg and ag 
// are different types
var generators = [rg, ag]

let g_of_rg = GeneratorOf(rg)
let g_of_ag = GeneratorOf(ag)
// this array will be of type [GeneratorOf<Int>]
var gofgenerators = [g_of_rg, g_of_ag]
gofgenerators[0].next()  // returns 1?
gofgenerators[1].next()  // returns 11?

GeneratorOfOne: Use Case 1
When you want to serve up a single value once from a generator.

let i = 42
var just_one = GeneratorOfOne(i)
just_one.next() // returns 42?
just_one.next() // returns nil

GeneratorOfOne: Use Case 2
When you want to serve up no values from a generator. If you pass in a variable set to nil, it will act like an EmptyGenerator.

let i: Int? = nil
// GeneratorOfOne.init takes T?, unlike 
// CollectionOfOne.init which takes T
var none = GeneratorOfOne(i)
// 'none' is now a GeneratorOfOne<Int>
// that returns no values.
none.next() // returns nil

GeneratorSequence
When you have a generator that isn’t a sequence, but you need a sequence.

let st = stride(from: 1, to: 7, by: 2)
// StrideTo's generator is a rare example
// of a Generator that isn't also a Sequence.
let stg = st.generate()

for i in stg {
    // compiler error, stg
    // is not a Sequence
}

for i in GeneratorSequence(stg) {
    // works OK
}

Be careful! The example above is fine, because the GeneratorSequence is created for a single use and then disposed of. GeneratorSequence doesn’t bestow magical resetting properties on the generator you give it. If you hand over a GeneratorSequence to some other function that requires a Sequence, that function could try and walk the sequence multiple times, which will have undefined results depending on what kind of generator it is. Generators that don’t declare they’re also Sequences probably don’t for a reason.

IndexingGenerator
When you’re implementing a collection, and the generator doesn’t need to do anything more fancy than use the startIndex and subscript methods to serve up each value. Most of the standard library collections use it.

The tiniest working toy collection. Isn’t it cute:

class TenInts: Collection {
    var startIndex: Int { return 0 }
    var endIndex: Int { return 10 }
    subscript(i: Int)->Int { return i }
    typealias GeneratorType = IndexingGenerator<TenInts>
    func generate() -> GeneratorType {
        return IndexingGenerator(self)
    }
}

let ti = TenInts()
reduce(ti, 0, +) // returns 45

PermutationGenerator
When you have a collection, and you want a Sequence or Generator that is made up of elements of that collection served up in an arbitrary order.

let r = 1...10

let tail = PermutationGenerator(elements: r,
            indices: r.startIndex.successor()..<r.endIndex)
// sequence of 2...10

let every_third = PermutationGenerator(elements: r,
                    indices: stride(from: r.startIndex,
                                to: r.endIndex,
                                by: 3))
// sequence of 1, 4, 7

Repeat
When you want a collection of the same values repeated a specific number of times.

You can change the number of times later, but you can’t change the value that repeats.

var finite_ones = Repeat(count: 100,
                    repeatedValue: 1)
countElements(finite_ones) // returns 100
finite_ones.count = 200
countElements(finite_ones) // returns 200
finite_ones.repeatedValue = 2
// compilation error: 
// repeatedValue is read-only

Repeat uses an IndexingGenerator, and then uses endIndex to control how many elements are generated.

SequenceOf: Use Case 1
When you want a sequence whose generator can be based on a closure you supply.

The difference between GeneratorOf and SequenceOf is subtle. GeneratorOf takes a closure that returns the next value. SequenceOf takes a closure that returns a generator, which if you’re following expected behaviour, will be a freshly reset generator. Chances are you’ll want to use GeneratorOf as the return value of your SequenceOf closure:

let naturals = SequenceOf {
    // I can't get the type inference 
    // grooving here for some reason
    _ -> GeneratorOf<Int> in
    // new i to capture for each
    // fresh generator
    var i = 0
    return GeneratorOf {
        ++i
    }
}

// Can now create multiple
// independent generators:
var n1 = naturals.generate()
var n2 = naturals.generate()
n1.next() // returns 1?
n2.next() // returns 1?
n1.next() // returns 2?
n2.next() // returns 2?

Note, you do not have to wrap GeneratorOf inside SequenceOf just to make it a sequence. GeneratorOf implements Sequence. The big difference is, if you need to reset the generator on each call to generate because the closure has state, you need SequenceOf. If your desired sequence is stateless (say it is an infinite sequence of a single constant, or a infinite sequence of the current time), you only need GeneratorOf.

SequenceOf: Use Case 2
Similar to GeneratorOf, when you want to treat different sequences that generate the same type as all the same type. SequenceOf has an init that takes other types of sequences and returns the same type based only the type of their Element.

let a = [1, 2, 3, 4]
let b = lazy(a).filter { $0%2 == 0 }
// this won't compile, a and b are different types
let seqs = [a, b]
// this will, both elements are SequenceOf<Int>
let seqs = [SequenceOf(a), SequenceOf(b)]

So there you go. Hopefully some of these will save you some typing, if you plan on writing your own collections or sequences, or need to generate some makeshift ones on the fly.

My guess is, the second init creates a closure that wraps the supplied generator and returns a GeneratorOf that calls it. That’s probably a pretty interesting idea to play with some other time. ↩

Working at being lazy

Posted on July 27, 2014July 29, 2014 by airspeedvelocity

As teased in the previous article, suppose you want a new kind of lazy evaluation. You want to be able to generate a lazy sequence that represents the first n values of a container.

Here’s how you could add it to LazyRandomAccess:

extension LazyRandomAccessCollection {
    func first_n
      (n: LazyRandomAccessCollection.IndexType.DistanceType)
      -> LazySequence<
           PermutationGenerator<
             LazyRandomAccessCollection,
             Range<LazyRandomAccessCollection.IndexType>
           >
         > {
        let start = self.startIndex
        let end = min(self.endIndex, start.advancedBy(n))
        let range = start..<end
        let perm = PermutationGenerator(elements: self, indices: range)
        return lazy(perm)
    }
}

This uses PermutationGenerator, which is a lazy view class that returns a (sub-)sequence of values from a collection in the order given by another collection of indices into that collection. In first_n’s case, the indices are a range of the first n indices (or the full collection if there are fewer than n elements).

The return type of first_n is a bit crackers so I’ve broken it up over multiple lines. It returns a lazy sequence of a permutation generator of a lazy random access collection, permuted by a range of lazy random access collection indices. Phew.¹ This is why the lazy function and Swift’s type inference is more than just nice to have, it’s essential. The actual types you are using in a simple function can quickly get to a point where there’s no practical way you could declare them by hand.

The practice of returning another lazy object that wraps the results is copied from the lazy map and filter members, which do the same. Why wrap PermutationGenerator in another LazySequence instead of returning it directly? Chaining mostly I think.² Without doing that, you’d have to re-lazy the result if you wanted to run more lazy filters on it:

let r = 1...10
// with lazy wrapper
let l = lazy(r).first_n(5).filter(isOdd)
// without lazy wrapper
let l = lazy(lazy(r).first_n(5)).filter(isOdd)

That works for a collection,³ but what about a sequence? How can we generate a new sequence that only returns the first n values?

Instead of declaring that in one shot, we’ll follow the pattern of map and filter and first declare a view:

struct FirstNSequenceView<Base: Sequence> : Sequence {
    private let _n: Int
    private let _base: Base
    init(_ base: Base, _ n: Int) {
        self._base = base
        self._n = n
    }

    func generate()
      -> GeneratorOf<Base.GeneratorType.Element> {
        var i: Int = 0
        var g = _base.generate()
        return GeneratorOf {
            ++i <= self._n ? g.next() : nil
        }
    }
}

This uses GeneratorOf, which takes for its constructor a closure that serves up values for each call to next(). This helps avoid having to manually write a companion generator class when you implement your own sequence (IndexingGenerator is a similar helper that can be used to conform to Sequence when you implement your own Collection). In this case, generate() creates a closure that captures a counter that will count up, returning elements from the sequence until it hits n. It then returns a GeneratorOf that will call that closure each time next() is called on it. Since each time generate() is called, a new closure with a fresh i will be created, this means FirstNSequenceView conforms to the requirement of Sequence that it can be walked multiple times with multiple independent generators.

Now that we have this view, it’s easy to declare first_n for all the lazy views:

extension LazySequence {
    func first_n(n: Int) -> LazySequence<FirstNSequenceView<LazySequence>> {
        return lazy(FirstNSequenceView(self,n))
    }
}
extension LazyForwardCollection {
    func first_n(n: Int) -> LazySequence<FirstNSequenceView<LazyForwardCollection>> {
        return lazy(FirstNSequenceView(self,n))
    }
}
extension LazyBidirectionalCollection {
    func first_n(n: Int) -> LazySequence<FirstNSequenceView<LazyBidirectionalCollection>> {
        return lazy(FirstNSequenceView(self,n))
    }
}
extension LazyRandomAccessCollection {
    func first_n(n: Int) -> LazySequence<FirstNSequenceView<LazyRandomAccessCollection>> {
        return lazy(FirstNSequenceView(self,n))
    }
}

It’s a bit of a hassle having to implement each one separately, but that’s the price you pay for structs that aren’t in a hierarchy. All the more reason to factor the hard part out into a seperate view object.

Does first_n belong as a lazy view member? Unlike map and filter, it doesn’t take a closure so the laziness is less likely to bite you. Maybe take_while, a filter that returned members of an array until a supplied predicate returned false, would have been a better example. But first_n could take a lazy sequence that itself depended on a closure, so the laziness of first_n could still come as a surprise. Also, the chaining is nice. Does this mean other lazy sequences like enumerate should migrate into being lazy class members? I dunno, maybe.

Without the chaining, you have to wrap the results in a function call. But the function call goes on the left, while chained calls go on the right, which results in a confusing ping-pong effect:

let l = lazy(first_n(lazy(r).filter(isOdd),5)).map(...)

The ordering of this is not obvious at first glance, but it’s filter, then first_n, then map. But then where does it end? Should everything get piled into the lazy classes just to benefit from chaining? Nah. The better solution is to implement a pipe forward operator, of which more some other time.

I have a nagging feeling this could be simplified. The Swift standard lazy functions don’t look quite this bad, they use a type of S instead of a type of themselves for the inner parts of the return values of map and filter. But then the compiler doesn’t completely faithfully represent the generic types in the declarations you see in Xcode (which is why there is a lot of T == T in what we see of the Swift standard library). Let me know if can figure out a better way. ↩
Also, if someone option-clicks the results, it’s obvious at a glance that the results are lazy. ↩
In fact, while a variant of the permutation version would work for forward and bidirectional collections as well, it’s not a good idea, because advance will take O(n) which is unnecessary, as the next version will show. ↩

Lazy by name, lazy by nature

Posted on July 26, 2014March 9, 2015 by airspeedvelocity

When last we discussed lazy collections and sequences, I opened the article with an “ah-hah-hah, this doesn’t do what you might assume” number.¹ What map and filter returned were objects that would evaluate and return results lazily on demand, so when you called them, no mapping or filtering took place right away. Instead it happened later, when you accessed the elements of a MapFilterView and its kin.

Well, turns out Apple decided that cleverly not doing what people might expect isn’t necessarily the best move, so as of beta 4, map and filter return an Array. They still take in collections and sequences of any kind, but an array is what they spit out.²

This is probably for the best. If you didn’t realize map computed lazily, you could be surprised when the results changed each time you iterated over a map using a closure like this:

let r = 1...4
var i = 0
let m = map(r) { $0 * ++i }

for e in m {
    // loops over 1, 4, 9, 16
}
for e in m {
    // loops over 5, 12, 21, 32
}

Even the Swift dev team weren’t immune to the unexpected consequences of lazyness. There were some bugs in using FilterCollectionView to populate an Array, as it took two passes, one to determine the array size needed and another to populate the array. A predicate that returned fewer results on the first pass than the second would result in buffer overrun.

Explicitly lazy

Now, with beta 4, there’s no excuses for getting surprised by lazy evaluation. If you still want to be lazy, you first need to pass your sequence or collection into a call to lazy(), which will give you back a lazy view class. What you get back depends on what you pass in – if you pass in a sequence, you’ll get back a LazySequence. If you pass in a collection, you’ll get back one of the lazy collection structs – either LazyForwardCollection, LazyBidirectionalCollection, or LazyRandomAccessCollection.

These views get progressively more features depending on the capabilities of their base. LazySequence has lazy map and filter methods that work like the old lazy map and filter functions, by returning another LazySequence object.

It also has an array property for crystalizing the lazy results into an array. Should you decide at a later point you want to iterate over the collection more than once, you should use this. If you don’t, duplicate iterations will wastefully re-run the mapping or filtering function over and over.

This also means that you can now safely write this:

var i = 0
let lf = lazy(1...5).filter { _ in
    ++i % 2 == 0
}
let a1 = lf.array
// a1 is [2, 4]
let a2 = lf.array
// a2 is [1, 3, 5]
let a3 = lf.array
// a3 is [2, 4]

LazyForwardCollection only adds subscript, since forward-indexable collections can’t do much more than sequences.

Note, filter still returns a sequence, even when called on the lazy collections, to avoid the heartache described above where other collection constructors assumed they could rely a collection’s length being consistent. The results of map can be a collection, because it returns a value for every element in the base no matter what. That collection inherits the index properties of the base.

LazyBidirectionalCollection and LazyRandomAccessCollection add the ability to reverse lazily. So if you wanted to filter just the first few items starting at the back of a collection, you could call lazy(col).reverse().filter { ... }.

The collection returned by reverse can be used wherever you use a regular collection. If you’re a C++ programmer and you liked the benefits of rbegin/rend, this might be what you’re looking for:

let s = "The cat in the hat"
let rs = lazy(s).reverse()
if let idx = find(rs, "h") {
    // idx points to the h of hat
    // not the h of The
}

The lazy factory

How you get the best lazy view class is pretty cool. lazy is actually 4 overloaded generic functions:

func lazy<S: SequenceType>(s: S) -> LazySequence<S>
func lazy<S: CollectionType where S.Index: ForwardIndexType>(s: S) -> LazyForwardCollection<S>
func lazy<S: CollectionType where S.Index: BidirectionalIndexType>(s: S) -> LazyBidirectionalCollection<S>
func lazy<S: CollectionType where S.Index: RandomAccessIndexType>(s: S) -> LazyRandomAccessCollection<S>

When you call lazy, the Swift compiler picks the most specific overload possible, preferring more specialized inherited protocols over base ones. So CollectionType beats SequenceType, because CollectionType inherits from SequenceType. CollectionType where S.Index: RandomAccessIndexType beats CollectionType where S.Index: BidirectionalIndexType because RandomAccessIndexType inherits from BidirectionalIndexType.³ What is returned is an instance of another generic class, that implements a lazy view on any specific collection or sequence.

I don’t know if there’s an official term for this, but I call it a generic factory. It’s similar to the abstract factory design pattern, in that you call a function to get back one of a range of possible concrete types. But in this case, the type determination happens at compile time, and what you get back is not an implementation of an abstract interface, but the actual appropriate concrete type.

This all feels transparent to the caller because of Swift’s type inference capabilities. You call lazy, passing in your base object, assign the result to a variable, and then merrily start using it. But you aren’t constrained to an interface exposing only the common features of the possible concrete classes, like you would be with an interface and absract factory set-up. If you passed in collection that’s capable of it, you get a reverse method.

Other than help pick the best container type, lazy doesn’t do much. There’s nothing stopping you from declaring the lazy views directly:

let r = 1...500
let l = LazyBidirectionalCollection(r)
let evens = l.filter { $0%2 == 0 }

But if you were implementing your own set of classes generated by this generic factory pattern, you could also put common set-up code in your generic factory method (or even have a generic factory class if needed).

Stride aside

By the way, the new stride function in beta 4 follows a similar pattern of returning different types at compile time from an overloaded function. But in its case, the overloading isn’t done by what you pass in. It isn’t done by types at all.

func stride<T: Strideable>(from start: T, to end: T, by stride: T.Stride) -> StrideTo<T>
func stride<T: Strideable>(from start: T, through end: T, by stride: T.Stride) -> StrideThrough<T>

These two functions differ only by the name of their middle parameter. I don’t know about you, but that this was possible was an eye opener. Score one for the Objective-C named arguments enthusiasts.

Extending lazy

So what if you have your own idea for a lazily evaluated filter to apply to sequences or collections? Well, you could extend the lazy classes to support it. We’ll look at that in the next article. Follow @airspeedswift to catch it.

Which, frankly, was a bit smug of me. ↩
Which, in this authors opinion, means they should kill the Array members map and filter, since they’re now duplicative. They could still be special-cased for performance purposes. ↩
This behaviour can also be used to pick an optimized version of a collection algorithm that takes advantage of random access e.g. to find a subsequence inside a collection. ↩

Changes in the Swift Standard Library in Beta 4

Posted on July 21, 2014July 21, 2014 by airspeedvelocity

Other than access control,¹ no big changes to the language with Beta 4, unlike with Beta 3. But still plenty of changes to the standard library.

Some small bits and pieces:

Float and Double are no longer random access indices. Who was planning on indexing a collection with a Double, anyway? Just how big was your collection going to be?²
uppercaseString and lowercaseString properties are gone from String. I never liked them. Call a function a function, that’s what I say.
insertionSort and quickSort are gone. Generic non-name brand sorts only from now on.
Instead of having static members for true and false, Bool now implements the new BooleanLiteralConvertible protocol
CString is gone. You can still construct strings from c-style strings, but those now take a ConstUnsafePointer. Probably to discourage people from keeping the nasty little things hanging around in their native primordial form.
remainderWithOverflow replaces modulusWithOverflow
Yet more types are reflectable, and there’s now an enum value to represent Objective-C objects separate from Swift ones.

Now the bigger changes…

ArrayBuffer, ArrayBufferType, SliceBuffer and ContiguousArrayBuffer are gone. It was kinda surprising they were there in the first place, revealing the inner workings of Array. Sorry if you spent some time figuring out how to use them, but you probably had no business poking around in there. As a result, the makeUniqe member functions are also gone.

reverse is no longer lazy. It now gives you back an Array. Huh. Looks like if you really want to reverse something lazily, you now need to create a LazyBidirectionalCollection via a call to lazy(), then call .reverse() on it to get another lazy collection. Bit cumbersome, but probably more explicit.

There’s also LazyForwardCollection, LazyRandomAccessCollection, and LazySequence. lazy() will return the appropriate one from your base collection or sequence. All of them give a base the ability to map and filter, plus conversion to an array. Only the bidirectional and random ones can be reversed, obvs. And the random access one has a subscript. I guess I need to update my article on lazy collections and sequences.

Range is no longer sliceable. I don’t really “get” sliceable, but what I do get suggests Range shouldn’t have been sliceable.

Speaking of ranges, were you trying to use Range with floating-point types? Well cut that out and use StrideTo and StrideThrough instead, both created with a call to stride (with either a to or a through as the second argument name). They mirror respectively a half-open and closed version of Range. The big difference being if your stride length takes you past the end, they stop striding.

The version of filter that took a collection not a sequence is gone. It was a bit pointless anyway, since you can’t index randomly into a filtered collection without doing all the filtering up to that point. And the sequence version is no longer lazy, it returns an Array. If you want to filter lazily, use the explicitly lazy stuff above. If you want to filter starting from the back, you can do lazy().reverse().filter()

As ever, I refrain from commenting on anything relating to Unicode and Objective-C bridging, for fear of demonstrating my ignorance.

About which I appear to be out of the mainstream in not being desperate for. I mean it’s important before go-live, sure. But some people seem to need a cold shower after hearing it’s in this beta. ↩
Or how small! No, wait… ↩

On dictionaries and initializers, in which I give up and use a for loop

Posted on July 18, 2014July 20, 2014 by airspeedvelocity

On twitter, @rob_rix confessed that he’d given up and used a for loop. @rbrockerhoff replied he had too, in his case because you can’t create a dictionary from an array.

That tweet sent me deep down the rabbit hole experimenting with Dictionary. I keep wanting to fix it, like a scruffy boyfriend. But like with said boyfriend, that’s easier said than done.

Warning: some of the below is very experimental, thoroughly likely to crash your playground, and quite possibly wrongheaded in lots of places. If you spot any of the latter, do tweet me a correction @airspeedswift.

The root of the problem is that, unlike Array, Dictionary doesn’t have an initializer that takes a Sequence of key/value tuples. If it did, you could do cool stuff like this:

let words = ["zero", "one","two"]
let enumerated = enumerate(words)
let number_to_word = Dictionary(enumerated)
// results in [0:"zero", 1:"one", 2:"two"]

let d = [0:"zero", 1:"one", 2:"two"]
let flipped = Dictionary(Zip2(d.values, d.keys))
// results in ["zero":0, "one":1, "two":2]

Now, if you root around in the standard library¹ you’ll find it does have this function, which it implements as part of the DictionaryLiteralConvertible protocol:

static func convertFromDictionaryLiteral
  (elements: (KeyType, ValueType)...) 
  -> [KeyType : ValueType]

This is tantalizingly close to what we want, at least for arrays if not sequences. It takes a variadic argument of tuples, which will in the function itself be represented as an array of tuples. Hey, we want to create a dictionary from an array of tuples! Can we use this?

Nope. Try as you might, you can’t shove an array through that function. If you managed to get one in, inside the function it would appear as an array inside an array. Languages like Ruby and Python have the ability to “splat” an array into a function as its parameters, but (as far as I can tell) there’s no way to do that in Swift.

Sigh. I guess there’s no alternative but to use a loop. But to make myself feel better about it, I’ll fix something else that bugs me about Dictionary. As we saw, convertFromDictionaryLiteral takes an array of (Key, Value) tuples. But the one function in Dictionary that also takes a key and a value takes them the other way around:

func updateValue(value: ValueType, forKey key: KeyType) -> ValueType?

You know how I said you can’t pass an array into a variadic argument in Swift? Well, you can pass a tuple into a function that has the same number of arguments as the tuple:

func takeTwoArgs(arg1: Int, arg2: String) {
//  do stuff with arg1 and arg2
}

let pair = (1, "one")
takeTwoArgs(pair)

To take an array of tuples and insert them into a dictionary would be a lot nicer if you could just pass the elements of the array in directly, rather than having to flip them around. This will do it:

extension Dictionary {
    // offensive vague naming of a function alert...
    mutating func update
      (key: KeyType, _ val: ValueType) 
      -> ValueType? {
        return self.updateValue(val, forKey: key)
    }
}

let pairs = [(0,"zero"), (1,"one"), (2,"two")]
var d: [Int:String] = [:]
for pair in pairs {
  d.update(pair)
}

OK, doing that makes me feel a bit more cheerful, but I’d still like to be able to construct a dictionary from scratch using an initializer. This isn’t just a whim, by the way – having to do the for loop above forces you to declare d with var, whereas an initializer could be used with let.

So how about we extend Dictionary with an initializer:

extension Dictionary {
  init(_ array: [Element]) {
    for pair in array {
      self.update(pair)
    }
  }
}

Oops, a mysterious compiler error. “Variable 'self._variantStorage' used before being initialized“. We’ve accidentally been exposed to the inner implementation of Dictionary. And unlike Array, which has some hints at how you might manipulate the underlying store with structs like ArrayBuffer, Dictionary is entirely enigmatic. We could try and figure out how to initialize _variantStorage by hand but that’s just going to lead to tears.

Or, we could just create another Dictionary, populate it, and then assign it to self right at the end of the initializer:

edit: Ignacio in the comments points out the (now blindingly obvious) alternative of just calling self.init() instead – thanks! But I leave the self-assignment here as something of interest that works.

edit edit: Maybe not so fast. self.init() works in playground, but not in compiled code right now.

extension Dictionary {
    init(_ array: [Element]) {
        var d: [KeyType:ValueType] = [:]
        for pair in array {
            d.update(pair)
        }
        self = d
    }
}

I have no idea if this is supposed to be allowed or not. I mean, it seems to work fine, but I think I might have voided the warranty. Certainly I’m going to think twice about filing this code as part of a radar.

Nevertheless, we can now do this:

let pairs = [(0,"zero"), (1,"one"), (2,"two")]
let d = Dictionary(pairs) // woo-hoo
// d now contains [0:"zero", 1:"one", 2:"two"]

For my last trick, I would love to say I’ve generalized this to work for sequences, but unfortunately I still can’t figure that one out. This seems like it ought to be the solution:

extension Dictionary {
    init<S: Sequence
      where S.GeneratorType.Element == Element>
      (_ seq: S) {
        var d = [KeyType:ValueType] = [:]
        for pair in seq {
            d.update(pair)
        }
        self = d
    }
}

but it segfaults the compiler (I need to find a way to reproduce this without the dodgy self-assignment, so I can file a radar with less silly code…) (edit: well, now I know a way! Was hoping that would fix the segfault, but no)

As a workaround, I’ve tried creating an array with the sequence and then using the previous initializer:

extension Dictionary {
    init<S: Sequence
        where S.GeneratorType.Element == Element>
        (_ seq: S) {
            var d = Dictionary(Array(seq))
            self = d
    }
}

However, this leads to the compiler compliaining that “'WhateverYourSequenceTypeIs' is not convertible to '[(KeyType, ValueType)]'“. This suggests it’s not matching the sequence initializer. If anyone can spot what’s wrong here and has a fix, let me know. For now, just stick an Array() around your init arguments and pretend you’re in for-loop free nirvana.

edit: found it! Read the next post for a solution.

In case you don’t know how to do that, type import Swift into a playground then command-click on the word Swift. ↩

Always write a generalized version of your algorithm

Posted on July 12, 2014July 14, 2014 by airspeedvelocity

Suppose you want to find the minimum value in an array. Well, the Swift standard library has you covered:

let a = [6, 2, 4, 7, 8, 10]
minElement(a)  // returns 2

Ok that’s cool. But suppose you have an array of strings of integers and you want the minimum numeric value.

Well, I guess you could map them to Int and then use the same function:

let a = ["6", "2", "4", "7", "8", "10"]
minElement( a.map { $0.toInt() })

Nope, compiler error. String.toInt returns an Int?, because maybe the string isn’t a number.

Fair enough. I happen to know with my intimate knowledge of the problem domain that we can just ignore non-numeric values. So maybe filter those out?

minElement( a.map { $0.toInt() }.filter { $0 != nil } )  // hmmm

Nope, more compiler errors, because now we have only non-nil values, but they’re still optionals and an Int? doesn’t conform to Comparable. So maybe another map to force-unwrap the results (force unwrap is OK because the filter will guarantee no non-nil elements):¹

minElement( a.map{ $0.toInt() }.filter { $0 != nil }.map { $0! } )  // yuck

Ok that compiles now and does return the minimum numeric value. But what if we wanted the result to still be a string?

"\(minElement( a.map{ $0.toInt() }.filter { $0 != nil }.map { $0! } ))" // bleurgh

That’s pretty horrible, and at this point I’m thinking of registering GiveUpAndUseAforLoop.com,² and Marco Arment is yelling at you kids with your new language to get off his lawn.

Really, what we need is for minElement to take a predicate instead of only using the default <. It doesn't (mutter, grumble) but if it did, we could write this:

minElement(a) {
    switch ($0.toInt(), $1.toInt()) {
    case (_, nil): return true
    case (nil, _): return false
    case (let i, let j): return i < j
    }
}

Ahh. Faith in the functional programming style restored.³ This takes each element, checks if the toInt() of either is nil using pattern matching, always preferring the non-nil version, and if neither are, compares them. Since all it's doing is comparing the converted value, not returning one, the minimum returned is a string.⁴

There's nothing stopping you from implementing your own overload of Apple's minElement that takes a predicate:⁵

func minElement
  <E, R: Sequence
  where E == R.GeneratorType.Element>
  (range: R, pred: (E,E)->Bool) -> E {

    var gen = range.generate()
    var min_so_far = gen.next()!

    while let next = gen.next() {
        if(pred(next,min_so_far)) {
            min_so_far = next
        }
    }

    return min_so_far
}

This version copies the Swift library version's questionable practice of exploding in flames if you pass it an empty sequence – the (better?) alternative being make the return value optional to force the caller to acknowledge this possibility.

The moral here is that if you ever write a high-order function, write it as generally as possible. It's only a few extra lines to declare the common case using the general case:⁶

func minElement
  <E: Comparable, R: Sequence
  where E == R.GeneratorType.Element>
  (range: R) -> E {
    return myMinElement(range, <)
}

Apple does this for most functions (such as sort), just not minElement. But then it is only beta 3, hang in there.

“I thought it was guaranteed not to be nil at this point” is what you could write to the log file in the outer exception handler you can’t have. ↩
Just kidding, I didn’t think about it, I registered it straight away. ↩
This code would be even neater if the Swift switch statement would implicitly return the values of single-line cases, i.e. if all the returns in that closure were removed, the switch would evaluate to the value of the selected case statement, and the closure would return it. They closed my radar as a dupe of 16169366 (don't file another duplicate though). ↩
It's still not perfect – if none of the elements are numeric, it'll return the first one. Also, it'll convert each element twice. Hold that thought for a later post though. ↩
Actually, there might be some compiler bugs stopping you, but I'll leave you to discover those on your own. ↩
If the language would let you, you could even default the last parameter to <, but it won't, so you can't. More dynamic default arguments (including ones derived from other arguments to the same function) could be pretty useful in cutting down the amount of code you need to write for these general- and special-case versions. ↩

Swift’s Lazy Collections and Sequences

Posted on July 10, 2014March 9, 2015 by airspeedvelocity

Warning: this article is out of date as of Swift 1.0 beta 4. Read this updated version instead.

The Swift non-member function map acts much like the Array.map member function, except you can pass in any kind of sequence or collection (click here for an article on how to write an algorithm like this).

But if you look at what map returns, it isn’t an array of results. It’s an object (a MapColectionView or a MapSequenceView).

That’s because map evaluates its results lazily. When you call map, no mapping takes place. Instead, the input and mapping function are stored for later, and values are only mapped as and when they are needed.

To see this happening, run the following code:

let r = 1...3
let mapped = map(r) {
    (i: Int) -> Int in
    println("mapping (i)")
    return i*2
}

println("crickets...")

for i in mapped {
    println("mapped (i)")
}

println("tumbleweed...")

println("index 2 = (mapped[2])")

// prints out
crickets...
mapping 1
mapped 2
mapping 2
mapped 4
mapping 3
mapped 6
tumbleweed...
mapping 2
index 2 = 4

The returned object from map mimics the properties from the object passed in. If you pass in a sequence, then you get back a sequence. If the collection you passed in only supported a bidirectional index (such as a Dictionary or a String, which can’t be indexed randomly), then you won’t be able to subscript arbitrarily into the mapped collection, you’ll have to walk through it.

map is not the only lazy function. filter and reverse also return these “view” classes. Combining these different lazily evaluated classes means you can build up quite complicated expressions without sacrificing efficiency.

Want to search backwards through an array or string? find(reverse(myarray), value) won't actually reverse the array to search through it, it just iterates over the ReverseView class, which serves up the array in reverse order.

Want the first 3 odd numbers that are a minimum of two arrays at each position? first_n(filter(map(Zip2(a1, a2), min) {$0%2 == 1}, 3) wont do any more zipping or minning or modding than necessary.¹²³

There are also several lazily generated sequences you can also use:

enumerate numbers each item in a collection — it returns an EnumerateGenerator, that serves up those numbered elements on demand. This is useful in a for (count, val) in enumerate(seq) loop where you want to know both the value and how many values so far.

PermutationGenerator is initialized with a collection and a collection of indices into that collection, and serves up each element at each index in that order (thus allowing you to lazily reorder any collection – for example, PermutationGenerator(elements: a, indices: reverse(a.startIndex..<a.endIndex)) would be the reverse of a.⁴

GeneratorOf just takes a closure and runs it each time to generate the next value. So var i = 0;var naturals = GeneratorOf({ ++i }) sets naturals to a sequence that counts up from 1 to infinity (or rather, to overflow).

These infinite virtual sequences can be pretty useful. For example, to do the same thing as enumerate, you could write Zip2(naturals, somecollection). Or you could pass your infinite sequence into map or filter to generate further inifinite sequences.

(at the opposite end, if you have a single value, but what a function needs is a Sequence, you can use GeneratorOfOne(value) to create a quick sequence that will just serve up just your one value)

It's not all rainbows and unicorn giggles though. Imagine you did the following:

let r = 1...10_000
mapped = map(r) { megaExpensiveOp($0) }

let a1 = mapped
let a2 = mapped

Here the lazy evaluation will work against you. megaExpensiveOp will be run not ten but twenty thousand times.

“Shouldn't map cache the data?” you ask. Well that leads to the next complication. Take this code:

var i =0
let mapped = map(1...5) {
    i += 1
    return $0 + i
}

Every time you iterate over mapped now, you'll get different results.⁵

This behaviour might be put to very good use (say you wanted to peturb a data set with small random values). But if you weren't expecting it, that could be one nasty bug to track down.

So its good to know when you're using these functions what they are actually doing under the hood. Just keep in mind these caveats and an exciting life of lazy evaluation awaits you in the off-world colonies.

OK I cheated, first_n isn't a Swift standard library function. But it should be! ↩
This code would look a lot nicer if Swift came with a pipe operator ↩
Yeah, this is a pretty rubbish example, I should have put the time in to think of a better real-life use case. ↩
Obviously, reverse(a) would work better, but a more useful example is tough to fit on one line… ↩
If you're unclear on how come, try reading this article. ↩

Changes to the Swift Standard Library in Beta 3

Posted on July 8, 2014July 9, 2014 by airspeedvelocity

Xcode Beta 3 came out today, and with it a new version of the Swift standard library.

By far the majority of changes are to account for the publicized updates to the language:

replacing Type[] with [Type], and making use of the new [Key:Val] sugar for dictionaries
switching uncheckedArithOp for arithOpWithOverflow in all the numeric types
adding explicit bigEndian and littleEndian initializers to the integers
replacing succ and pred with successor and predecessor¹
adding sorted as an operation on arrays and as a standalone algo, since sort now sorts in-place²

copy and unshare have been removed from Array, since they’re redundant now arrays have full value semantics. Assigning an array to another variable copies it. Or rather, it lazily copies it – the array will only by physically copied when you alter one of the elements. So really, the equivalent of unshare is called automatically when any value in the copied array or the copy is altered. You can still detect when this happens with the === operator, which returns true if two arrays are pointing to the same underlying store:

var a = [1, 2, 3, 4]
let b = a  // 'a' is "copied"
// but really, it's still the same
a === b    // returns true
a[2] = 2   // until you alter 'a'
// and now they're different
a === b    // returns false

This is really important to understand, in case you start to panic about all the unnecessary copying that might happen when you pass an array into a function as a parameter.

There are also a handful of other changes/additions that aren’t mentioned in the release notes.

Range now supports map as a member, so you can write (1...5).map { $0 * 2 } much like with an Array, rather than map(1...5) { $0 * 2 }. Note, Range.map returns an Array, unlike the map algorithm, which returns a lazily-computed view class.

Lots more types (such as Range) now have a getMirror() method.

min and max, which previously took two or more parameters (the “more” part being a variadic parameter) now have an overload with just two parameters. This means you can pass them as an argument to another function that expects a binary operator. So the following now compiles where previously it would throw an error:³

let a = [5, 3, 1, 7, 9]
let b = [2, 4, 1, 1, 7]

// now you can write this:
let mins = map(Zip2(a, b), min)

// instead of being forced to write this:
let mins = map(Zip2(a, b)) { min($0, $1) }

Finally, the already mysterious insertionSort got a bit more mysterious.

Previously there were two versions, both of which required a collection with a bidirectional index. One just required that the collection contents be comparable. The other allowed the caller to supply a predicate to compare elements with. That’s a pretty standard pattern, but oddly that predicate is declared inout. Why a sort algorithm needs to be able to change the caller’s comparison function I do not know, but it can!⁴

The new insertionSort that doesn’t take a predicate now requires a random-access index, rather than a bidirectional index. The version that takes the predicate still only requires a bidirectional index. It’s predicate is still inout though.

There are a few other changes, relating to unicode character processing and Objective-C interfacing, that I either don’t feel qualified to comment on or haven’t delved into. Feel free to mention antyhing interesting about them in the comments.

Bit odd, that, I preferred the shorter versions. ↩
I really hate they did this to the non-member version, as it means you can’t chain it. ↩
If you don’t follow what this does – Zip2 takes two sequences and “zips” them together, producing a new sequence that has pairs of elements from the same point in both sequences. map then takes each element of this sequence and applies a function to each one. That function, min, takes the pairs and returns the minimum. So the whole expression returns a sequence of the minimum of each position in either array. ↩
It does mean you can’t call it with a trailing closure expression. You have to create a variable and point that at a function and then pass that variable in. That can’t be the reason, can it? ↩

	Sorting Nibbles in S… on Sorting Nibbles in Swift
	Collection Data Stru… on Arrays, Linked Lists and …
	Swift化零为整：Reduce 详解… on Arrays, Linked Lists and …
	Writing A Generic St… on Generic Collections, SubSequen…
	Writing A Generic St… on Collection Indices, Slices, an…

	Sorting Nibbles in S… on Sorting Nibbles in Swift
	Collection Data Stru… on Arrays, Linked Lists and …
	Swift化零为整：Reduce 详解… on Arrays, Linked Lists and …
	Writing A Generic St… on Generic Collections, SubSequen…
	Writing A Generic St… on Collection Indices, Slices, an…

Airspeed Velocity

African or European Swift?

swift standard library