Wayne Conrad's Blog

home

More ConcatenatingEnumerator

10 Mar 2015

In a previous article I introduced a Concatenating Enumerator. Today I found a use for it in production code. It could be that I just had a hammer and was looking for a nail, but I think the code came out really clean.

ConcatenatingEnumerator

Here’s the concatenating enumerator. It lets you glue any number of enumerable things together and treat them as a single enumerator:

class ConcatenatingEnumerator < Enumerator
  def initialize(enumerators = [])
    super() do |yielder|
      enumerators.each do |enumerator|
        enumerator.to_enum.each do |item|
          yielder.yield item
        end
      end
    end
  end
end

It can be used like this:

enum1 = [1, 2, 3]
enum2 = [4, 5]
enum = ConcatenatingEnumerator.new([enum1, enum2])
p enum.to_a    #=> [1, 2, 3, 4, 5]

This example uses #to_a, but any of the usual enumeration methods will work on a ConcatenatingEnumerator: #each, #first, #map, and so on.

Since ConcatenatingEnumerator calls #to_enum on its arguments, it will take either enumerators, or anything that can be treated as an enumerator (like Array). (Calling #to_enum is a change from the previous version of ConctatenatingEnumerator).

Using ConcatenatingEnumerator to read log files

The main method

The application parses one or more log files and prints some statistics. Here’s the program’s main method:

logs = Logs.new(@args.paths)
stats = Stats.new
parser = Parser.new(stats)
logs.each do |line|
  parser.parse(line)
end
StatsPrinter.print(stats)

Even though this program reads multiple log files, the main method isn’t concerned with that. The Logs class treats all the log files as a single enumeration. This makes the main method’s life easy, and its logic is easy to follow. Only the highest level of abstraction is visible here.

Logs

Logs is perfectly simple:

class Logs < ConcatenatingEnumerator

  def initialize(paths)
    logs = paths.map { |path| Log.new(path) }
    super(logs)
  end

end

We turn the log paths into instances of Log. A Log is enumerable, so ConcatenatingEnumerator can glue them together into a single enumeration.

Log

A Log is more interesting, because we want to open each file as needed, and close it as soon as possible. That way, the program only needs to have one file open at a time.

class Log

  def initialize(path)
    @path = path
  end

  def to_enum
    Enumerator.new do |yielder|
      File.open(@path, "r") do |file|
        file.each_line do |line|
          yielder.yield(line)
        end
      end
    end
  end

end

#to_enum is what lets the ConcatenatingEnumerator use this object. The method returns an enumerator. When that enumerator is used (and not before), the file is opened, and each line is yielded in turn. Once all of the lines are yielded, the file is closed.

The real program does more

This code teeters on the edge of too much abstraction. There’s a lot of mechanism being used to simply read some log files one after the other. The main method could be more like this, eliminating the Logs class and the use of ConcatenatingEnumerator:

stats = Stats.new
parser = Parser.new(stats)
@args.paths.each do |path|
  Log.open(path) do |log|
    log.each do |line|
      parser.parse(line)
    end
  end
end
StatsPrinter.print(stats)

But I prefer the more minimal main method that ConcatenatingEnumerator makes possible. You may reasonably disagree.

comments powered by Disqus