(编辑:jimmy 日期: 2024/12/26 浏览:2)
JOIN对于接触过数据库的人,这个词都不陌生,而且很多人很清楚各种JOIN,还有很多人对这个理解也不是很透彻。
假设我们有两个表,Table_A和Table_B。这两个表中的数据如下所示:
TABLE_A | TABLE_B PK Value | PK Value ---- ---------- | ---- ---------- 1 FOX | 1 TROT 2 COP | 2 CAR 3 TAXI | 3 CAB 6 WASHINGTON | 6 MONUMENT 7 DELL | 7 PC 5 ARIZONA | 8 MICROSOFT 4 LINCOLN | 9 APPLE 10 LUCENT | 11 SCOTCH
join_table: table_reference JOIN table_factor [join_condition] //内连接 | table_reference {LEFT|RIGHT|FULL} [OUTER] JOIN table_reference join_condition //外连接 | table_reference LEFT SEMI JOIN table_reference join_condition //左半连接 | table_reference CROSS JOIN table_reference [join_condition] (as of Hive 0.10) table_reference: table_factor //表 | join_table //join语句 table_factor: tbl_name [alias] //表名[别名] | table_subquery alias //子查寻[别名] | ( table_references ) //带空号的table_reference join_condition: ON expression //on开头的条件语句
这是最简单、最容易理解的连接,也是最常见的连接。此查询将返回左表(表A)中具有右表(表B)中匹配记录的所有记录。此连接写成如下:
SELECT <select_list> FROM Table_A A INNER JOIN Table_B B ON A.Key = B.Key
-- Inner JOIN SELECT A.PK AS A_PK, A.Value AS A_Value, B.Value AS B_Value, B.PK AS B_PK FROM Table_A A INNER JOIN Table_B B ON A.PK = B.PK A_PK A_Value B_Value B_PK ---- ---------- ---------- ---- 1 FOX TROT 1 2 COP CAR 2 3 TAXI CAB 3 6 WASHINGTON MONUMENT 6 7 DELL PC 7 (5 row(s) affected)
此查询将返回左表(表A)中的所有记录,而不管这些记录是否与右表(表B)中的任何记录匹配。它还将从正确的表中返回任何匹配的记录。此连接写成如下:
SELECT <select_list> FROM Table_A A LEFT JOIN Table_B B ON A.Key = B.Key
-- Left JOIN SELECT A.PK AS A_PK, A.Value AS A_Value, B.Value AS B_Value, B.PK AS B_PK FROM Table_A A LEFT JOIN Table_B B ON A.PK = B.PK A_PK A_Value B_Value B_PK ---- ---------- ---------- ---- 1 FOX TROT 1 2 COP CAR 2 3 TAXI CAB 3 4 LINCOLN NULL NULL 5 ARIZONA NULL NULL 6 WASHINGTON MONUMENT 6 7 DELL PC 7 10 LUCENT NULL NULL (8 row(s) affected)
此查询将返回左表(表A)中与右表(表B)中的任何记录都不匹配的所有记录。此连接写成如下:
SELECT <select_list> FROM Table_A A LEFT JOIN Table_B B ON A.Key = B.Key WHERE B.Key IS NULL
-- Left Excluding JOIN SELECT A.PK AS A_PK, A.Value AS A_Value, B.Value AS B_Value, B.PK AS B_PK FROM Table_A A LEFT JOIN Table_B B ON A.PK = B.PK WHERE B.PK IS NULL A_PK A_Value B_Value B_PK ---- ---------- ---------- ---- 4 LINCOLN NULL NULL 5 ARIZONA NULL NULL 10 LUCENT NULL NULL (3 row(s) affected)
此查询将返回右表(表B)中的所有记录,而不管这些记录中是否有任何记录与左表(表A)中的记录相匹配。它还将返回左表中的任何匹配记录。此连接写成如下:
SELECT <select_list> FROM Table_A A RIGHT JOIN Table_B B ON A.Key = B.Key
-- Right JOIN SELECT A.PK AS A_PK, A.Value AS A_Value, B.Value AS B_Value, B.PK AS B_PK FROM Table_A A RIGHT JOIN Table_B B ON A.PK = B.PK A_PK A_Value B_Value B_PK ---- ---------- ---------- ---- 1 FOX TROT 1 2 COP CAR 2 3 TAXI CAB 3 6 WASHINGTON MONUMENT 6 7 DELL PC 7 NULL NULL MICROSOFT 8 NULL NULL APPLE 9 NULL NULL SCOTCH 11 (8 row(s) affected)
此查询将返回右表(表B)中与左表(表A)中的任何记录都不匹配的所有记录。此连接写成如下:
SELECT <select_list> FROM Table_A A RIGHT JOIN Table_B B ON A.Key = B.Key WHERE A.Key IS NULL
-- Right Excluding JOIN SELECT A.PK AS A_PK, A.Value AS A_Value, B.Value AS B_Value, B.PK AS B_PK FROM Table_A A RIGHT JOIN Table_B B ON A.PK = B.PK WHERE A.PK IS NULL A_PK A_Value B_Value B_PK ---- ---------- ---------- ---- NULL NULL MICROSOFT 8 NULL NULL APPLE 9 NULL NULL SCOTCH 11 (3 row(s) affected)
此联接也可以称为完全外联接或完全联接。此查询将返回两个表中的所有记录,连接左表(表A)中与右表(表B)中的记录相匹配的记录。此连接写成如下:
SELECT <select_list> FROM Table_A A FULL OUTER JOIN Table_B B ON A.Key = B.Key
-- Outer JOIN SELECT A.PK AS A_PK, A.Value AS A_Value, B.Value AS B_Value, B.PK AS B_PK FROM Table_A A FULL OUTER JOIN Table_B B ON A.PK = B.PK A_PK A_Value B_Value B_PK ---- ---------- ---------- ---- 1 FOX TROT 1 2 COP CAR 2 3 TAXI CAB 3 6 WASHINGTON MONUMENT 6 7 DELL PC 7 NULL NULL MICROSOFT 8 NULL NULL APPLE 9 NULL NULL SCOTCH 11 5 ARIZONA NULL NULL 4 LINCOLN NULL NULL 10 LUCENT NULL NULL (11 row(s) affected)
此查询将返回左表(表A)中的所有记录和右表(表B)中不匹配的所有记录。我还不需要使用这种类型的联接,但所有其他类型的联接我都相当频繁地使用。此连接写成如下:
SELECT <select_list> FROM Table_A A FULL OUTER JOIN Table_B B ON A.Key = B.Key WHERE A.Key IS NULL OR B.Key IS NULL
-- Outer Excluding JOIN SELECT A.PK AS A_PK, A.Value AS A_Value, B.Value AS B_Value, B.PK AS B_PK FROM Table_A A FULL OUTER JOIN Table_B B ON A.PK = B.PK WHERE A.PK IS NULL OR B.PK IS NULL A_PK A_Value B_Value B_PK ---- ---------- ---------- ---- NULL NULL MICROSOFT 8 NULL NULL APPLE 9 NULL NULL SCOTCH 11 5 ARIZONA NULL NULL 4 LINCOLN NULL NULL 10 LUCENT NULL NULL (6 row(s) affected)
注意,在外部联接上,首先返回内部连接记录,然后返回右连接记录,最后返回左连接记录(至少,我的Microsoft SQL Server就是这样做的;当然,这不需要使用任何ORDERBY语句)。您可以访问维基百科文章以获得更多信息(但是,条目不是图形化的)。我还创建了一个备忘单,您可以在需要时打印出来。如果您右键单击下面的图像并选择“将目标保存为.”,您将下载完整大小的图像。