
背景:在升级社区的hive 1.2.1过程中发现,如果使用子查询嵌套,里外别名相同时,会出现结果错误的情况。
如:
select
s.*
from
(
select
1000 as qq_no
)s join
(
select
a.qq_no
from
(
select qq_no from t_om_qqbook_ykklcy_qqno where ds=201510 and qq_no>0
)a left outer join
(
select qq_no from t_mm_qqbook_b1_blacklist where ds=201510
)b on a.qq_no=b.qq_no
where b.qq_no is null
)b on s.qq_no=b.qq_no
limit 10;
定位问题:修改外层别名,结果正确;
select
s.*
from
(
select
1000 as qq_no
)s join
(
select
a.qq_no
from
(
select qq_no from t_om_qqbook_ykklcy_qqno where ds=201510 and qq_no>0
)a left outer join
(
select qq_no from t_mm_qqbook_b1_blacklist where ds=201510
)b on a.qq_no=b.qq_no
where b.qq_no is null
)t on s.qq_no=t.qq_no
limit 10;
且修改内存列别名结果依然正确
select
s.*
from
(
select
1000 as qq_no
)s join
(
select
a.qq_no as qq_nox
from
(
select qq_no from t_om_qqbook_ykklcy_qqno where ds=201510 and qq_no>0
)a left outer join
(
select qq_no from t_mm_qqbook_b1_blacklist where ds=201510
)b on a.qq_no=b.qq_no
where b.qq_no is null
)b on s.qq_no=b.qq_nox
limit 10;
跟踪explain extended 解析步骤发现
正常结果的步骤里有predicate
Stage: Stage-2
Map Reduce
Map Operator Tree:
TableScan
alias: t_om_qqbook_ykklcy_qqno
Statistics: Num rows: 1611465 Data size: 15052568 Basic stats: COMPLETE Column stats: NONE
GatherStats: false
Filter Operator
isSamplingPred: false
predicate: ((qq_no > 0) and qq_no is not null) (type: boolean)
Statistics: Num rows: 268578 Data size: 2508766 Basic stats: COMPLETE Column stats: NONE
Select Operator
expressions: qq_no (type: bigint)
outputColumnNames: _col0
Statistics: Num rows: 268578 Data size: 2508766 Basic stats: COMPLETE Column stats: NONE
Reduce Output Operator
key expressions: _col0 (type: bigint)
sort order: +
Map-reduce partition columns: _col0 (type: bigint)
Statistics: Num rows: 268578 Data size: 2508766 Basic stats: COMPLETE Column stats: NONE
tag: 0
auto parallelism: false
TableScan
alias: t_mm_qqbook_b1_blacklist
Statistics: Num rows: 3311453 Data size: 33021350 Basic stats: COMPLETE Column stats: NONE
GatherStats: false
Filter Operator
isSamplingPred: false
predicate: qq_no is not null (type: boolean)
Statistics: Num rows: 1655727 Data size: 16510679 Basic stats: COMPLETE Column stats: NONE
Select Operator
expressions: qq_no (type: bigint)
outputColumnNames: _col0
Statistics: Num rows: 1655727 Data size: 16510679 Basic stats: COMPLETE Column stats: NONE
Reduce Output Operator
key expressions: _col0 (type: bigint)
sort order: +
Map-reduce partition columns: _col0 (type: bigint)
Statistics: Num rows: 1655727 Data size: 16510679 Basic stats: COMPLETE Column stats: NONE
tag: 1
auto parallelism: false
Path -> Alias:
hdfs://10.179.18.50:9000/user/hive/warehouse/u_wsd.db/t_mm_qqbook_b1_blacklist/ds=201510 [t:b:t_mm_qqbook_b1_blacklist]
hdfs://10.179.18.50:9000/user/hive/warehouse/u_wsd.db/t_om_qqbook_ykklcy_qqno/ds=201510 [t:a:t_om_qqbook_
错误结果步骤里没有
Stage: Stage-2
Map Reduce
Map Operator Tree:
TableScan
alias: t_om_qqbook_ykklcy_qqno
Statistics: Num rows: 1 Data size: 0 Basic stats: PARTIAL Column stats: NONE
GatherStats: false
Filter Operator
isSamplingPred: false
predicate: false (type: boolean)
Statistics: Num rows: 1 Data size: 0 Basic stats: PARTIAL Column stats: NONE
Select Operator
expressions: qq_no (type: bigint)
outputColumnNames: _col0
Statistics: Num rows: 1 Data size: 0 Basic stats: PARTIAL Column stats: NONE
Reduce Output Operator
key expressions: _col0 (type: bigint)
sort order: +
Map-reduce partition columns: _col0 (type: bigint)
Statistics: Num rows: 1 Data size: 0 Basic stats: PARTIAL Column stats: NONE
tag: 0
auto parallelism: false
TableScan
alias: t_mm_qqbook_b1_blacklist
Statistics: Num rows: 1 Data size: 0 Basic stats: PARTIAL Column stats: NONE
GatherStats: false
Filter Operator
isSamplingPred: false
predicate: false (type: boolean)
Statistics: Num rows: 1 Data size: 0 Basic stats: PARTIAL Column stats: NONE
Select Operator
expressions: qq_no (type: bigint)
outputColumnNames: _col0
Statistics: Num rows: 1 Data size: 0 Basic stats: PARTIAL Column stats: NONE
Reduce Output Operator
key expressions: _col0 (type: bigint)
sort order: +
Map-reduce partition columns: _col0 (type: bigint)
Statistics: Num rows: 1 Data size: 0 Basic stats: PARTIAL Column stats: NONE
tag: 1
auto parallelism: false
断定是predicate过程发生错误。
跟踪日志:
打开调试日志

发现正常的不同别名日志里没有替换b.qq_no:

同别名的有发生替换:

所有在后续的替换中发生不同转换:
不同别名:

同别名:

再定位到相应代码,找到原因:
发生的原因是当嵌套的子查询别名一样时,而且输出的列名一样时,当里层的子查询产生Constant predicate替换时

外层的子查询在查找相应的表+列时能找到里层子查询的表+列;
可但是,外层的b.qq_no并不是里层的b.qq_no生成的,而是a.qq_no生成的,所以需要检查以下columnExprMap,
且需要确认如果里层子查询是select 1001 as qq_no ...时同样需要检测constant的值是否一样。
总之:是由于外面的子查询别名在比对上层的Filter Operater的opToConstantExprs时发现了相同的表+列;但实际上外层取的列已经是一个新的ExprNode

修改代码

打完收工。