In the previous article on extending lazy, we extended the lazy views with first_n
, which returned a sequence of the first n elements of a sequence.
But this isn’t great if what you passed in was a random-access indexed collection. Where possible, it’s nice to retain random access on the output of your lazy function. The lazy map
does this, for example. If you pass an array into lazy
then call map
, the collection you get back is a LazyRandomAccessView
, and you can index randomly into it just like you could into an array.
To do this for first_n
, we need a struct that implements Collection
, takes a Collection
as an initializer plus a count n
, and layers a view over the top of that collection that only exposes the first n
elements.
Or, in the more general case, a view that takes a collection and a sub-range within that collection. Here it is:
struct SubrangeCollectionView<Base: Collection>: Collection { private let _base: Base private let _range: Range<Base.IndexType> init(_ base: Base, subrange: Range<Base.IndexType>) { _base = base _range = subrange } var startIndex: Base.IndexType { return _range.startIndex } var endIndex: Base.IndexType { return _range.endIndex } subscript(idx: Base.IndexType) -> Base.GeneratorType.Element { return _base[idx] } typealias GeneratorType = IndexingGenerator<SubrangeCollectionView> func generate() -> GeneratorType { return IndexingGenerator(self) } }
This sub-range collection view is so useful that I was sure it was somewhere in the Swift standard library. That’s partly why I wrote this article on collection and sequence helpers – as a side-effect of going line by line looking for it. I’m still expecting someone to reply to this article saying “no you fool, you just use X”. If you are that person, call me a fool here.
The Sliceable
protocol seems designed to provide this function. It requires a collection to typealias a SliceType
and implement a subscript(Range)-
>SliceType
method. But you need the collection to implement sliceable, which seems overly restrictive, whereas the view above works for any collection.
SubrangeCollectionView
enables you to pass in a sub-range of any collection to any algorithm that takes a collection or sequence. For example:
let r = 1...10 let halfway = r.startIndex.advancedBy(5) let top_half = SubrangeCollectionView(r, subrange: halfway..<r.endIndex) reduce(top_half,0,+) // returns 6+7+8+9+10
A nice feature is that if you pass a subrange collection into an algorithm that returns an index, such as find
, the index returned can also be used on the base collection. So in the example above, if you ran if let idx = find(top_half,8)
, you could then use idx
on not just top_half[idx]
, but also r[idx]
.
Operating on subranges is a capability that C++ developers, used to the STL, will be missing from Swift collection algorithms. In the STL, operations on containers are performed not by passing the container itself to the algorithm, but instead by passing two iterators on that container defining a range.
For example, here is the definition of the find
algorithm in the C++ STL compared to the Swift version:
// C++ std::find template <class InputIterator, class T> InputIterator find (InputIterator first, InputIterator last, const T& val); // Swift.find func find <C: Collection where C.GeneratorType.Element: Equatable> (domain: C, value: C.GeneratorType.Element) -> C.IndexType?
While Swift.find
takes a collection as its first argument, std::find
takes a first
and last
iterator.
STL container iterators are similar to Swift collection indices, in that they can be forward, bidirectional or random, and are used to move up and down the collection. Like Swift indices, the end iterator points to one past the last element – the equivalent of Swift.find
returning nil
for not found would be std::find
returning end
.
But unlike Swift indices, STL iterators model C pointer-like behaviour, so they are dereferencable. To get the value the iterator points to, you don’t have to have access to the underlying container, using something like Swift’s subscript. To get the value at first
, you call *first
. This is why std::find
doesn’t need to take a container in it’s input. It just increments first
until either it equals last
, or *first
equals val
.
There are lots of reasons why the STL models iterators like this: because it’s like pointers into C arrays; because it dodges memory management issues (which won’t apply to Swift) etc. But also because, with this model, every algorithm will work just as well on a sub-range of a container as on the whole container. You don’t have to pass the starting or ending iterator for the container into find
, you can pass in any two arbitrary points.1
Swift indices, on the other hand, are pretty dumb beasts. They don’t have to know about what value they point to or what collection they index. Heck, the index for Array
is just an Int
.2
For all the advantages of Swift’s just passing in collections as arguments has (it looks a lot cleaner, doesn’t confuse beginners),3 it lacks this ability to apply algorithms to subranges. SubrangeCollectionView
gives you that ability back.
Anyway, back to the original problem, which was to enhance LazyRandomAccessView
with a version of first_n
that returns another LazyRandomAccessView
. With the help of SubrangeCollectionView
, here it is:
extension LazyRandomAccessCollection { func first_n (n: LazyRandomAccessCollection.IndexType.DistanceType) -> LazyRandomAccessCollection<SubrangeCollectionView<LazyRandomAccessCollection>> { let start = self.startIndex let end = min(self.endIndex, start.advancedBy(n)) let range = start..<end let perm = SubrangeCollectionView(self, subrange: range) return lazy(perm) } } let a = [1, 2, 3, 4, 5, 6, 7] let first3 = lazy(a).first_n(3) first3[1] // returns 2
One final interesting question is whether to do the same for forward and bidirectional lazy views as well. The problem here is computing the endIndex
for the view. Unlike with a random-access index, you can’t just advance them by n in constant time. It takes O(n) because the index has to be walked up one by one. For this reason, I’d stick with returning sequences for these.
-
Of course, you can also pass a later iterator to
first
than tolast
, at which point, kaboom! A risk which passing in the container itself eliminates. ↩ -
Of course, you could write an über-index type for your collection, that did understand about what it pointed at. Maybe you could even overload the
*
operator for it to dereference itself. ↩ - And lets face it, you operate on the whole collection 99.9% of the time. ↩
[…] my recent posts, my mind immediately went to making unzip lazy. But before we get to that, let’s muck about […]
[…] about their container would have the side benefit of allowing them to become dereferenceable (see a previous post for more on […]