Lazy Loading and Data Mappers

Written by Hector Virgen
Published on January 18, 2010
Last updated on February 5, 2010

Lazy loading is a simple yet powerful tool in any developer's toolbox, and its knack for procrastination is especially useful in domain modeling.

Let's say we're building a simple blogging application in which each blog post can have zero or more comments. We might have models like this:

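class Blog
{
    protected $_title;
    protected $_body;
    protected $_comments = array();

    // Accepts an array or any Traversable, such as a lazy-loading iterator.
    public function setComments($comments) { $this->_comments = $comments; }

    public function getComments() { return $this->_comments; }
}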

class Comment
{
    protected $_blog;
    protected $_from;
    protected $_body;
}

When developing your data mapper, you may intuitively want to load all the comments for a blog post whenever you load the post from the database. However, you may not need them. For example, you may simply be showing a list of all blog posts without displaying any comments. Loading them would be wasted work, but your data mapper has no way of knowing how much of the object you will use at the time you request it.

This is where lazy loading can help. To solve this, you'll want to create a lazy-loading iterator. Initially, this iterator is given the information it needs to build the collection, without actually building the collection itself.

So what kind of information does the iterator need? Only two things: a data mapper class (or instance), and a list of IDs. As you iterate, each instance is fetched by calling find() on the mapper with the ID at the current position, then cached so that it is only fetched once. Here's an example:

class CommentCollection implements SeekableIterator
{
    protected $_ids = array();
    protected $_mapper;
    protected $_instances = array();
    protected $_position = 0;

    public function setIds(array $ids)
    {
        $this->_ids = $ids;
        $this->_instances = array();
    }

    public function getIds()
    {
        return $this->_ids;
    }

    public function setMapper($mapper)
    {
        $this->_mapper = $mapper;
    }

    public function getMapper()
    {
        if (is_string($this->_mapper)) {
            $this->_mapper = new $this->_mapper;
        }
        return $this->_mapper;
    }

    public function key()
    {
        return $this->_position;
    }

    public function next()
    {
        ++$this->_position;
    }

    public function rewind()
    {
        $this->_position = 0;
    }

    public function valid()
    {
        return array_key_exists($this->_position, $this->_ids);
    }

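    public function seek($position)
    {
        $position = (int) $position;
        // SeekableIterator::seek() is expected to reject invalid positions.
        if (!array_key_exists($position, $this->_ids)) {
            throw new OutOfBoundsException("Invalid position: $position");
        }
        $this->_position = $position;
    }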

    public function current()
    {
        // Fetch the instance only if it has not been loaded yet.
        if (!array_key_exists($this->_position, $this->_instances)) {
            $this->_instances[$this->_position] = $this->getMapper()->find($this->_ids[$this->_position]);
        }
        return $this->_instances[$this->_position];
    }
}

Now, instead of passing an array of fully instantiated comments to your Blog, you can pass in this iterator. But, as you may have noticed, the iterator still needs a list of comment IDs up front. Since we may not be displaying the comments at all, let's take this one step further and make the comment IDs lazy-loaded, too. To do this, we'll create a new class that wraps this one. Instead of giving it an array of IDs, we'll give it a mapper method name and arguments, which it calls to fetch the IDs the first time it is iterated.

class CommentCollectionLoader implements SeekableIterator
{
    protected $_collection;
    protected $_mapper;
    protected $_arguments = array();
    protected $_method;

    public function setIds(array $ids)
    {
        $this->getCollection()->setIds($ids);
    }

    public function getIds()
    {
        return $this->getCollection()->getIds();
    }

    public function setMapper($mapper)
    {
        $this->_mapper = $mapper;
    }

    public function getMapper()
    {
        if (is_string($this->_mapper)) {
            $this->_mapper = new $this->_mapper;
        }
        return $this->_mapper;
    }

    public function setMethod($method)
    {
        $this->_method = $method;
    }

    public function getMethod()
    {
        return $this->_method;
    }

    public function setArguments(array $arguments)
    {
        $this->_arguments = $arguments;
    }

    public function getArguments()
    {
        return $this->_arguments;
    }

    public function getCollection()
    {
        if (null === $this->_collection) {
            $this->_collection = new CommentCollection();
            // The inner collection needs the mapper to call find() on each ID.
            $this->_collection->setMapper($this->getMapper());
            $ids = call_user_func_array(array($this->getMapper(), $this->_method), $this->_arguments);
            $this->_collection->setIds($ids);
        }
        return $this->_collection;
    }

    public function key()
    {
        return $this->getCollection()->key();
    }

    public function next()
    {
        $this->getCollection()->next();
    }

    public function rewind()
    {
        $this->getCollection()->rewind();
    }

    public function valid()
    {
        return $this->getCollection()->valid();
    }

    public function seek($position)
    {
        $this->getCollection()->seek($position);
    }

    public function current()
    {
        return $this->getCollection()->current();
    }
}

The difference is that this new collection can be instantiated and passed directly to the Blog instance without running any SQL queries until the very moment you need the comments.

$blog = new Blog();
$comments = new CommentCollectionLoader();
$comments->setMapper('CommentMapper');
$comments->setMethod('findByBlog');
$comments->setArguments(array($blog));
$blog->setComments($comments);

foreach ($blog->getComments() as $comment) {
    assert($comment instanceof Comment); // true
}
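The usage above relies on a CommentMapper that the article never shows. All the two collection classes appear to need from it is a find() method that returns one instance for an ID, and a finder such as findByBlog() that returns only an array of IDs. As a rough sketch of that contract, here is a hypothetical in-memory stand-in; the class name InMemoryCommentMapper, the add() helper, and the findCalls counter are assumptions made for illustration, and a real mapper would query the database instead.

```php
<?php
// Hypothetical in-memory mapper, for illustration only.
// A real mapper would run SQL queries in find() and findByBlog().
class InMemoryCommentMapper
{
    protected $_comments = array();  // comment ID => comment object
    protected $_idsByBlog = array(); // blog object hash => list of comment IDs
    public $findCalls = 0;           // counts simulated single-row lookups

    // Register a comment under an ID and associate it with a blog.
    public function add($id, $blog, $comment)
    {
        $this->_comments[$id] = $comment;
        $this->_idsByBlog[spl_object_hash($blog)][] = $id;
    }

    // Return the comment with the given ID, or null if it does not exist.
    public function find($id)
    {
        ++$this->findCalls;
        return isset($this->_comments[$id]) ? $this->_comments[$id] : null;
    }

    // Return only the IDs of the blog's comments, not the comments themselves.
    public function findByBlog($blog)
    {
        $key = spl_object_hash($blog);
        return isset($this->_idsByBlog[$key]) ? $this->_idsByBlog[$key] : array();
    }
}

$blog = new stdClass();                 // stands in for a Blog instance
$mapper = new InMemoryCommentMapper();
$mapper->add(7, $blog, new stdClass()); // stands in for a Comment
$mapper->add(9, $blog, new stdClass());
$ids = $mapper->findByBlog($blog);      // array(7, 9); find() has not been called yet
```

Because the mapper setter accepts either a class name or an instance, a pre-populated object like this could be passed in directly, which makes it easy to check in a test how many lookups a page actually triggers.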

By following this pattern throughout your data mappers, you can traverse the entire domain from a single instance, and only the queries you actually need will run.
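The classes in this article were written for the PHP of 2010. On PHP 8.1 and later, implementing SeekableIterator without return types triggers deprecation notices, so the same idea needs typed signatures. The following is a condensed sketch of the lazy collection under that assumption; the class name TypedLazyCollection and the callable $loader (used here in place of a mapper object) are hypothetical and not part of the original design.

```php
<?php
// Hypothetical, condensed variant of the lazy collection for PHP 8.1+.
// $loader is an assumption: any callable that turns an ID into an object.
class TypedLazyCollection implements SeekableIterator
{
    private array $ids;
    private $loader;
    private array $instances = [];
    private int $position = 0;

    public function __construct(array $ids, callable $loader)
    {
        $this->ids = array_values($ids); // force 0-based sequential positions
        $this->loader = $loader;
    }

    public function key(): mixed { return $this->position; }
    public function next(): void { ++$this->position; }
    public function rewind(): void { $this->position = 0; }

    public function valid(): bool
    {
        return array_key_exists($this->position, $this->ids);
    }

    public function seek(int $offset): void
    {
        if (!array_key_exists($offset, $this->ids)) {
            throw new OutOfBoundsException("Invalid position: $offset");
        }
        $this->position = $offset;
    }

    public function current(): mixed
    {
        // Load on first access, then serve the cached instance.
        if (!array_key_exists($this->position, $this->instances)) {
            $this->instances[$this->position] = ($this->loader)($this->ids[$this->position]);
        }
        return $this->instances[$this->position];
    }
}

$calls = 0;
$comments = new TypedLazyCollection([10, 20, 30], function ($id) use (&$calls) {
    ++$calls; // stands in for one database query
    return "comment #$id";
});
$callsBefore = $calls;                  // 0: constructing the collection loads nothing
$first = iterator_to_array($comments);  // loads each item once
$second = iterator_to_array($comments); // served from the cache, no further loads
```

Passing a callable keeps the sketch self-contained; wrapping a mapper's find() method in a closure would give the same behavior as the mapper-based version above.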
