HiveQL Select Join

广告位

JOIN是子句用于通过使用共同值组合来自两个表特定字段。它是用来从数据库中的两个或更多的表组合的记录。它或多或…

JOIN是子句用于通过使用共同值组合来自两个表特定字段。它是用来从数据库中的两个或更多的表组合的记录。它或多或少类似于SQL JOIN。

语法

  join_table:       table_reference JOIN table_factor [join_condition]     | table_reference {LEFT|RIGHT|FULL} [OUTER] JOIN table_reference     join_condition     | table_reference LEFT SEMI JOIN table_reference join_condition     | table_reference CROSS JOIN table_reference [join_condition]

示例

我们在本章中将使用下面的两个表。考虑下面的表CUSTOMERS..

  +----+----------+-----+-----------+----------+   | ID | NAME     | AGE | ADDRESS   | SALARY   |   +----+----------+-----+-----------+----------+   | 1  | Ramesh   | 32  | Ahmedabad | 2000.00  |    | 2  | Khilan   | 25  | Delhi     | 1500.00  |    | 3  | kaushik  | 23  | Kota      | 2000.00  |   | 4  | Chaitali | 25  | Mumbai    | 6500.00  |   | 5  | Hardik   | 27  | Bhopal    | 8500.00  |   | 6  | Komal    | 22  | MP        | 4500.00  |   | 7  | Muffy    | 24  | Indore    | 10000.00 |   +----+----------+-----+-----------+----------+

考虑另一个表命令如下:

  +-----+---------------------+-------------+--------+   |OID  | DATE                | CUSTOMER_ID | AMOUNT |   +-----+---------------------+-------------+--------+   | 102 | 2009-10-08 00:00:00 |           3 | 3000   |   | 100 | 2009-10-08 00:00:00 |           3 | 1500   |   | 101 | 2009-11-20 00:00:00 |           2 | 1560   |   | 103 | 2008-05-20 00:00:00 |           4 | 2060   |   +-----+---------------------+-------------+--------+

有不同类型的联接给出如下:

  • JOIN
  • LEFT OUTER JOIN
  • RIGHT OUTER JOIN
  • FULL OUTER JOIN

JOIN

JOIN子句用于合并和检索来自多个表中的记录。 JOIN和SQLOUTER JOIN 类似。连接条件是使用主键和表的外键。

下面的查询执行JOIN的CUSTOMER和ORDER表,并检索记录:

  hive> SELECT c.ID, c.NAME, c.AGE, o.AMOUNT      > FROM CUSTOMERS c JOIN ORDERS o      > ON (c.ID = o.CUSTOMER_ID);

成功执行查询后,能看到以下回应:

  +----+----------+-----+--------+   | ID | NAME     | AGE | AMOUNT |   +----+----------+-----+--------+   | 3  | kaushik  | 23  | 3000   |   | 3  | kaushik  | 23  | 1500   |   | 2  | Khilan   | 25  | 1560   |   | 4  | Chaitali | 25  | 2060   |   +----+----------+-----+--------+

LEFT OUTER JOIN

HiveQL LEFT OUTER JOIN返回所有行左表,即使是在正确的表中没有匹配。这意味着,如果ON子句匹配的右表0(零)记录,JOIN还是返回结果行,但在右表中的每一列为NULL。

LEFT JOIN返回左表中的所有的值,加上右表,或JOIN子句没有匹配的情况下返回NULL。

下面的查询演示了CUSTOMER 和ORDER 表之间的LEFT OUTER JOIN用法:

  hive> SELECT c.ID, c.NAME, o.AMOUNT, o.DATE      > FROM CUSTOMERS c      > LEFT OUTER JOIN ORDERS o      > ON (c.ID = o.CUSTOMER_ID);

成功执行查询后,能看到以下回应:

  +----+----------+--------+---------------------+   | ID | NAME     | AMOUNT | DATE                |   +----+----------+--------+---------------------+   | 1  | Ramesh   | NULL   | NULL                |   | 2  | Khilan   | 1560   | 2009-11-20 00:00:00 |   | 3  | kaushik  | 3000   | 2009-10-08 00:00:00 |   | 3  | kaushik  | 1500   | 2009-10-08 00:00:00 |   | 4  | Chaitali | 2060   | 2008-05-20 00:00:00 |   | 5  | Hardik   | NULL   | NULL                |   | 6  | Komal    | NULL   | NULL                |   | 7  | Muffy    | NULL   | NULL                |   +----+----------+--------+---------------------+

RIGHT OUTER JOIN

HiveQL RIGHT OUTER JOIN返回右边表的所有行,即使有在左表中没有匹配。如果ON子句的左表匹配0(零)的记录,JOIN结果返回一行,但在左表中的每一列为NULL。

RIGHT JOIN返回右表中的所有值,加上左表,或者没有匹配的情况下返回NULL。

下面的查询演示了在CUSTOMER和ORDER表之间使用RIGHT OUTER JOIN。

  hive> SELECT c.ID, c.NAME, o.AMOUNT, o.DATE      > FROM CUSTOMERS c      > RIGHT OUTER JOIN ORDERS o      > ON (c.ID = o.CUSTOMER_ID);

成功执行查询后,能看到以下回应:

  +------+----------+--------+---------------------+   | ID   | NAME     | AMOUNT | DATE                |   +------+----------+--------+---------------------+   | 3    | kaushik  | 3000   | 2009-10-08 00:00:00 |   | 3    | kaushik  | 1500   | 2009-10-08 00:00:00 |   | 2    | Khilan   | 1560   | 2009-11-20 00:00:00 |   | 4    | Chaitali | 2060   | 2008-05-20 00:00:00 |   +------+----------+--------+---------------------+

FULL OUTER JOIN

HiveQL FULL OUTER JOIN结合了左边,并且满足JOIN条件合适外部表的记录。连接表包含两个表的所有记录,或两侧缺少匹配结果那么使用NULL值填补

下面的查询演示了CUSTOMER 和ORDER 表之间使用的FULL OUTER JOIN:

  hive> SELECT c.ID, c.NAME, o.AMOUNT, o.DATE      > FROM CUSTOMERS c      > FULL OUTER JOIN ORDERS o      > ON (c.ID = o.CUSTOMER_ID);

成功执行查询后,能看到以下回应:

  +------+----------+--------+---------------------+   | ID   | NAME     | AMOUNT | DATE                |   +------+----------+--------+---------------------+   | 1    | Ramesh   | NULL   | NULL                |   | 2    | Khilan   | 1560   | 2009-11-20 00:00:00 |   | 3    | kaushik  | 3000   | 2009-10-08 00:00:00 |   | 3    | kaushik  | 1500   | 2009-10-08 00:00:00 |   | 4    | Chaitali | 2060   | 2008-05-20 00:00:00 |   | 5    | Hardik   | NULL   | NULL                |   | 6    | Komal    | NULL   | NULL                |  | 7    | Muffy    | NULL   | NULL                |    | 3    | kaushik  | 3000   | 2009-10-08 00:00:00 |   | 3    | kaushik  | 1500   | 2009-10-08 00:00:00 |   | 2    | Khilan   | 1560   | 2009-11-20 00:00:00 |   | 4    | Chaitali | 2060   | 2008-05-20 00:00:00 |   +------+----------+--------+---------------------+

关于作者: 大数据教程学习

为您推荐