* Changed the default behavior for AR eager loading so that it generates and executes a single SQL statement

qiang.xue
2009-06-13 18:34:44 +00:00
parent 6b6f246d48
commit 79178ec2a4
8 changed files with 100 additions and 55 deletions

View File

@@ -7,6 +7,7 @@ Version 1.1a to be released
- New: Refactored scenario-based validation and massive assignments (Qiang)
- New: Added CDbSchema::checkIntegrity() and resetSequence() (Qiang)
- New: Added phpunit-based testing framework (Qiang)
- Chg: Changed the default behavior for AR eager loading so that it generates and executes a single SQL statement (Qiang)
Version 1.0.7 to be released
----------------------------

View File

@@ -45,7 +45,6 @@ class Post extends CActiveRecord
'author'=>array(self::BELONGS_TO, 'User', 'authorId'),
'comments'=>array(self::HAS_MANY, 'Comment', 'postId', 'order'=>'??.createTime'),
'tagFilter'=>array(self::MANY_MANY, 'Tag', 'PostTag(postId, tagId)',
'together'=>true,
'joinType'=>'INNER JOIN',
'condition'=>'??.name=:tag'),
);

View File

@@ -215,35 +215,6 @@ categories. It will also bring back each author's profile and posts.
> Note: The usage of the [with()|CActiveRecord::with] method has been changed
> since version 1.0.2. Please read the corresponding API documentation carefully.
The AR implementation in Yii is very efficient. When eager loading
a hierarchy of related objects involving `N` `HAS_MANY` or `MANY_MANY`
relationships, it will take `N+1` SQL queries to obtain the needed results.
This means it needs to execute 3 SQL queries in the last example because of
the `posts` and `categories` properties. Other frameworks take a more
radical approach by using only one SQL query. At first look, the radical approach
seems more efficient because fewer queries are being parsed and executed by
DBMS. It is in fact impractical in reality for two reasons. First, there
are many repetitive data columns in the result which takes extra time to
transmit and process. Second, the number of rows in the result set grows
exponentially with the number of tables involved, which makes it simply
unmanageable as more relationships are involved.
Since version 1.0.2, you can also enforce the relational query to be done with
only one SQL query. Simply append a [together()|CActiveFinder::together] call
after [with()|CActiveRecord::with]. For example,
~~~
[php]
$posts=Post::model()->with(
'author.profile',
'author.posts',
'categories')->together()->findAll();
~~~
The above query will be done in one SQL query. Without calling [together|CActiveFinder::together],
this will need two SQL queries: one joins `Post`, `User` and `Profile` tables,
and the other joins `User` and `Post` tables.
Relational Query Options
------------------------
@@ -292,7 +263,11 @@ from `aliasToken` in that the latter is just a placeholder and will be
replaced by the actual table alias.
- `together`: whether the table associated with this relationship should
be forced to join together with the primary table. This option is only meaningful for HAS_MANY and MANY_MANY relations. If this option is not set or false, each HAS_MANY or MANY_MANY relation will have their own JOIN statement to improve performance. This option has been available since version 1.0.3.
be forced to join together with the primary table. This option is only meaningful
for HAS_MANY and MANY_MANY relations. If this option is set to false,
each HAS_MANY or MANY_MANY relation will have its own JOIN statement,
which may improve overall query performance. For more details, see the section
"Relational Query Performance". This option has been available since version 1.0.3.
- `group`: the `GROUP BY` clause. It defaults to empty. Note, column
references need to be disambiguated using `aliasToken` (e.g. `??.age`).
@@ -344,6 +319,7 @@ use a placeholder to indicate the existence of a column which needs to be
disambiguated. AR will replace the placeholder with a suitable table alias
and properly disambiguate the column.
Dynamic Relational Query Options
--------------------------------
@@ -372,6 +348,67 @@ $posts=$user->posts(array('condition'=>'status=1'));
~~~
Relational Query Performance
----------------------------
As we described above, the eager loading approach is mainly used in the scenario
when we need to access many related objects. It generates a big complex SQL statement
by joining all needed tables. A big SQL statement is preferable in many cases
since it simplifies filtering based on a column in a related table.
It may not be efficient in some cases, however.
Consider an example where we need to find the latest posts together with their comments.
Assuming each post has 10 comments, using a single big SQL statement, we will bring back
a lot of redundant post data since each post will be repeated for every comment it has.
Now let's try another approach: we first query for the latest posts, and then query for their comments.
In this new approach, we need to execute two SQL statements. The benefit is that there is
no redundancy in the query results.
So which approach is more efficient? There is no absolute answer. Executing a single big SQL statement
may be more efficient because it causes less overhead in the DBMS for parsing and executing
the SQL statements. On the other hand, using the single SQL statement, we end up with more redundant data
and thus need more time to read and process it.
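The trade-off above can be made concrete with a minimal sketch. It does not call Yii or a database: plain in-memory arrays stand in for the `Post` and `Comment` tables, and the number of posts (3), the column names and all variable names are assumptions made for illustration only. It counts how many column values each approach would have to bring back.

~~~
[php]
// Hypothetical in-memory data standing in for the Post and Comment tables:
// 3 posts (an assumed number) with 10 comments each.
$posts=array();
$comments=array();
for($p=1;$p<=3;$p++)
{
	$posts[$p]=array('id'=>$p,'title'=>"Post $p");
	for($c=1;$c<=10;$c++)
		$comments[]=array('postID'=>$p,'content'=>"Comment $c");
}

// Approach 1: one JOIN yields a row per (post, comment) pair,
// so both post columns are repeated for every comment.
$joinedValues=0;
foreach($comments as $comment)
	$joinedValues+=count(array_merge($posts[$comment['postID']],$comment));

// Approach 2: two statements; every post and every comment is read once.
$separateValues=0;
foreach($posts as $post)
	$separateValues+=count($post);
foreach($comments as $comment)
	$separateValues+=count($comment);

echo "single statement: $joinedValues column values\n";   // 120
echo "two statements: $separateValues column values\n";   // 66
~~~

The gap widens as the primary table gains more or larger columns, because each of them is copied into every joined row. The sketch says nothing about parsing and execution overhead, which is the cost on the other side of the trade-off.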
For this reason, Yii provides the `together` query option so that we can choose between the two approaches as needed.
By default, Yii adopts the first approach, i.e., generating a single SQL statement to perform
eager loading. We can set the `together` option to false in the relation declarations so that some of
the tables are joined in separate SQL statements. For example, in order to use the second approach
to query for the latest posts with their comments, we can declare the `comments` relation
in the `Post` class as follows:
~~~
[php]
public function relations()
{
return array(
'comments' => array(self::HAS_MANY, 'Comment', 'postID', 'together'=>false),
);
}
~~~
We can also dynamically set this option when we perform the eager loading:
~~~
[php]
$posts = Post::model()->with(array('comments'=>array('together'=>false)))->findAll();
~~~
> Note: In version 1.0.x, the default behavior is that Yii will generate and
> execute `N+1` SQL statements if there are `N` `HAS_MANY` or `MANY_MANY` relations.
> Each `HAS_MANY` or `MANY_MANY` relation has its own SQL statement. By calling
> the `together()` method after `with()`, we can enforce that only a single SQL statement
> is generated and executed. For example,
>
> ~~~
> [php]
> $posts=Post::model()->with(
> 'author.profile',
> 'author.posts',
> 'categories')->together()->findAll();
> ~~~
>
Statistical Query
-----------------

View File

@@ -24,3 +24,15 @@ used, instead.
- Removed CModel::getValidatorsForAttribute(). Please use CModel::getValidators() instead.
- Removed CHtml::scenario
Changes Related with Eager Loading for Relational Active Record
---------------------------------------------------------------
- By default, a single JOIN statement will be generated and executed
for all relations involved in the eager loading. If the primary table
has its `LIMIT` or `OFFSET` query option set, it will be queried alone
first, followed by another SQL statement that brings back all its related
objects. Previously, in version 1.0.x, the default behavior was that
there would be `N+1` SQL statements if an eager loading involved
`N` `HAS_MANY` or `MANY_MANY` relations.
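
One likely reason the primary table is queried alone when `LIMIT` or `OFFSET` is set: in a joined result the limit counts joined rows rather than primary records. The following is a minimal sketch of that effect using in-memory arrays instead of Yii or a real database. The data, column names and variable names are all hypothetical.

~~~
[php]
// Hypothetical data: post 1 has three comments, post 2 has one.
$posts=array(
	array('id'=>1,'title'=>'First'),
	array('id'=>2,'title'=>'Second'),
);
$comments=array(
	array('postID'=>1,'content'=>'a'),
	array('postID'=>1,'content'=>'b'),
	array('postID'=>1,'content'=>'c'),
	array('postID'=>2,'content'=>'d'),
);

// Single JOIN with LIMIT 2: the limit cuts joined rows, not posts.
$joined=array();
foreach($posts as $post)
	foreach($comments as $comment)
		if($comment['postID']===$post['id'])
			$joined[]=array('postID'=>$post['id'],'content'=>$comment['content']);
$postsSeenInJoin=array();
foreach(array_slice($joined,0,2) as $row)
	$postsSeenInJoin[$row['postID']]=true;

// Primary table limited on its own, then related rows fetched for those posts.
$limitedPosts=array_slice($posts,0,2);
$ids=array();
foreach($limitedPosts as $post)
	$ids[]=$post['id'];
$related=array();
foreach($comments as $comment)
	if(in_array($comment['postID'],$ids,true))
		$related[]=$comment;

echo count($postsSeenInJoin)." post(s) from the limited JOIN\n";              // 1
echo count($limitedPosts)." posts and ".count($related)." comments in two steps\n"; // 2 and 4
~~~

With the limit applied to the joined rows, only the first post comes back and it is missing one of its comments. Querying the primary table first returns both posts with all of their comments.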

View File

@@ -32,7 +32,7 @@ class CActiveFinder extends CComponent
* This property is internally used.
* @since 1.0.2
*/
public $baseLimited;
public $baseLimited=false;
private $_joinCount=0;
private $_joinTree;
@@ -56,20 +56,16 @@ class CActiveFinder extends CComponent
/**
* Uses the most aggressive join approach.
* By default, several join statements may be generated in order to avoid
* fetching duplicated data. By calling this method, all tables will be joined
* together all at once.
* @param boolean whether we should enforce join even when a limit option is placed on the primary table query.
* Defaults to true. If false, we would still use two queries when there is a HAS_MANY/MANY_MANY relation and
* the primary table has a LIMIT option. This parameter is available since version 1.0.3.
* By calling this method, even if a LIMIT/OFFSET option is set for
* the primary table query, we will still use a single SQL statement.
* By default (without calling this method), the primary table will be queried
* by itself so that LIMIT/OFFSET can be correctly applied.
* @return CActiveFinder the finder object
* @since 1.0.2
*/
public function together($ignoreLimit=true)
public function together()
{
$this->joinAll=true;
if($ignoreLimit)
$this->baseLimited=false;
return $this;
}
@@ -393,9 +389,9 @@ class CJoinElement
if($this->_parent===null) // root element
{
$query=new CJoinQuery($this,$criteria);
if($this->_finder->baseLimited===null)
$this->_finder->baseLimited=($criteria->offset>=0 || $criteria->limit>=0);
$this->_finder->baseLimited=($criteria->offset>=0 || $criteria->limit>=0);
$this->buildQuery($query);
$this->_finder->baseLimited=false;
$this->runQuery($query);
}
else if(!$this->_joined && !empty($this->_parent->records)) // not joined before
@@ -445,12 +441,12 @@ class CJoinElement
{
$query->limit=$child->relation->limit;
$query->offset=$child->relation->offset;
if($this->_finder->baseLimited===null)
$this->_finder->baseLimited=($query->offset>=0 || $query->limit>=0);
$this->_finder->baseLimited=($query->offset>=0 || $query->limit>=0);
$query->groups[]=str_replace($child->relation->aliasToken.'.',$child->tableAlias.'.',$child->relation->group);
$query->havings[]=str_replace($child->relation->aliasToken.'.',$child->tableAlias.'.',$child->relation->having);
}
$child->buildQuery($query);
$this->_finder->baseLimited=false;
$this->runQuery($query);
foreach($child->children as $c)
$c->find();
@@ -543,7 +539,7 @@ class CJoinElement
foreach($this->children as $child)
{
if($child->relation instanceof CHasOneRelation || $child->relation instanceof CBelongsToRelation
|| $child->relation->together || ($this->_finder->joinAll && !$this->_finder->baseLimited))
|| $this->_finder->joinAll || !$this->_finder->baseLimited && $child->relation->together)
{
$child->_joined=true;
$query->join($child);

View File

@@ -1620,12 +1620,6 @@ class CActiveRelation extends CBaseActiveRelation
* For more details about this property, see {@link CActiveRecord::with()}.
*/
public $with=array();
/**
* @var boolean whether this table should be joined with the primary table. If not set or false,
* Each HAS_MANY or MANY_MANY table will appear in a separate JOIN statement. Defaults to null.
* @since 1.0.3
*/
public $together;
/**
* Merges this relation with a criteria specified dynamically.
@@ -1712,6 +1706,13 @@ class CHasManyRelation extends CActiveRelation
* @var integer offset of the rows to be selected. It is effective only for lazy loading this related object. Defaults to -1, meaning no offset.
*/
public $offset=-1;
/**
* @var boolean whether this table should be joined with the primary table.
* When setting this property to be false, the table associated with this relation will
* appear in a separate JOIN statement. Defaults to true, meaning the table will be joined
* together with the primary table. Note that in version 1.0.x, the default value of this property was false.
*/
public $together=true;
/**
* Merges this relation with a criteria specified dynamically.

View File

@@ -551,7 +551,7 @@ class CActiveRecord2Test extends CTestCase
'select'=>'title',
'condition'=>'posts.id=:id',
'limit'=>1,
'offset'=>1,
'offset'=>2,
'order'=>'posts.title',
'params'=>array(':id'=>2)));
$this->assertTrue($posts===array());

View File

@@ -527,9 +527,8 @@ class CActiveRecordTest extends CTestCase
'select'=>'title',
'condition'=>'posts.id=:id',
'limit'=>1,
'offset'=>1,
'offset'=>2,
'order'=>'posts.title',
'group'=>'posts.id',
'params'=>array(':id'=>2)));
$this->assertTrue($posts===array());
}