Ruby Gem Sarah Version 2.0.1 Released

Ruby Gem Sarah version 2.0.1 has just been released.

What Is It?

Sarah is a combination sequential array, sparse array, and (“random access”) hash.

Ruby’s own array literal and method calling syntaxes allow you to specify a list of sequential values followed by an either implicit or explicit hash of name/value pairs stored at end of the array. Sarah takes this concept a few steps further.

Values with sequential indexes beginning at 0 are typically stored in the sequential array for efficiency. You can also assign values with non-sequential indexes, and these values are stored in the sparse array (which is actually implemented as a hash). The sequential and sparse arrays work together like a traditional Ruby array, except that there can really be empty holes with no values (as opposed to having nil values as place-holders where no other value has been set in the case of a traditional Ruby array). You can perform most of the typical array operations, including pushing, popping, shifting, unshifting, and deleting. These result in the re-indexing of sparse values in addition to sequential values after the point of insertion or deletion, just as if they had all been stored in a traditional Ruby array.

Values stored with non-integer keys are stored in a separate “random access” (i.e. unordered) hash. Re-indexing of the sequential and sparse arrays does not affect these key/value pairs.

Instead of accessing sparse and random-access values through a hash at the end of the array first, these values all appear at the same level. Compare:

# Traditional Ruby array with implicit hash
a = ['first', 5 => 'second', :greeting => 'hello']
# a[0] = 'first'
# a[1] is a hash
# a[1][5] = 'second'
# a[1][:greeting] = 'hello'

# Using a Sarah
s = Sarah['first', 5 => 'second', :greeting => 'hello']
# s[0] = 'first'
# s[5] = 'second'
# s[:greeting] = 'hello'

Why Should I Use It?

Sarah provides a pure-Ruby sparse array implementation, and can easily be the basis for a pure-Ruby sparse matrix implementation. It also provides efficient linear storage and manipulation in case you don’t know in advance if your data will be sequential or sparse in nature (i.e. it can vary significantly based on user input).

By default, negative indexes are interpreted relative to the end of the array. However, if it’s appropriate to your problem domain, Sarah also has a mode that supports negative indexes as actual indexes. In this mode, insertions and deletions do not result in value re-indexing.

Leave a Reply