How to import file data into a Hive table

Contents

  • Background
  • Part 1: Create the database and Hive table
  • Part 2: Prepare and load the data
  • Part 3: Query
  • References

Background

Before loading anything, use the head command to take a quick look at the data file.
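
A quick look at the first few lines shows whether the file has a header row and which delimiter it uses, and both matter for the table definition. The sketch below is only an illustration: it writes a tiny stand-in file named sample_user_log.csv whose rows are invented, with a header assumed to match the table columns used in this post. On a real system you would point head at user_log.csv directly.

```shell
# Create a small stand-in file; the header is assumed and the rows are made up for illustration.
cat > sample_user_log.csv <<'EOF'
user_id,item_id,cat_id,merchant_id,brand_id,month,day,action,age_range,gender,province
u001,i100,c10,m5,b7,11,11,0,3,1,Beijing
u002,i101,c11,m6,b8,11,11,2,4,0,Shanghai
EOF

# Show the first two lines: the header plus one data row.
head -n 2 sample_user_log.csv

# Count the comma-separated fields in the header; this should equal the number of table columns.
head -n 1 sample_user_log.csv | awk -F',' '{ print NF }'
```

If the field count differs from the number of columns in the table, or the delimiter is not a comma, the table definition needs to be adjusted before loading.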

Part 1: Create the database and Hive table

Start the interactive Hive shell with the following command.

root@hadoop01:/opt/hive/bin# hive

Create the database:

CREATE DATABASE cda;

Create the table. The fields are comma-delimited, and the skip.header.line.count property tells Hive to ignore the first line of the file (the CSV header):

CREATE TABLE IF NOT EXISTS cda.users (
user_id string,
item_id string,
cat_id string,
merchant_id string,
brand_id string,
month string,
day string,
action string,
age_range string,
gender string,
province string
)
COMMENT 'user_log.csv Table'
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
TBLPROPERTIES ("skip.header.line.count"="1");

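If the table needs to be partitioned, move month and day out of the column list and into the PARTITIONED BY clause. Note that this statement reuses the name cda.users with IF NOT EXISTS, so it is silently skipped if the non-partitioned table above already exists; drop that table first or choose a different name.

CREATE TABLE IF NOT EXISTS cda.users (
user_id string,
item_id string,
cat_id string,
merchant_id string,
brand_id string,
action string,
age_range string,
gender string,
province string
)
PARTITIONED BY (month string, day string)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
TBLPROPERTIES ("skip.header.line.count"="1");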
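
In Hive 2.x releases, a plain LOAD DATA into a partitioned table needs an explicit PARTITION clause, and the CSV here carries month and day as ordinary columns in the middle of each row. One common approach is to load the file into a non-partitioned staging table first and then insert into the partitioned table with dynamic partitioning. The sketch below assumes two hypothetical table names that do not appear in the original post: cda.users_staging (the non-partitioned layout) and cda.users_part (the partitioned layout). It writes the statements to load_partitioned.hql and runs them only if a hive client is found on the PATH.

```shell
# Write the HiveQL to a script file; the table names are hypothetical placeholders.
cat > load_partitioned.hql <<'EOF'
SET hive.exec.dynamic.partition=true;
SET hive.exec.dynamic.partition.mode=nonstrict;

-- The partition columns must come last in the SELECT list, in PARTITION clause order.
INSERT OVERWRITE TABLE cda.users_part PARTITION (month, day)
SELECT user_id, item_id, cat_id, merchant_id, brand_id,
       action, age_range, gender, province,
       month, day
FROM cda.users_staging;
EOF

# Run the script only when a Hive client is installed; otherwise just keep the file.
if command -v hive >/dev/null 2>&1; then
  hive -f load_partitioned.hql
else
  echo "hive client not found; statements saved to load_partitioned.hql"
fi
```

Setting the mode to nonstrict allows every partition column to be resolved from the data, which is what this insert relies on.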

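Part 2: Prepare and load the data

Upload the data file to HDFS. The target directory /data should already exist; if it does not, create it first with hdfs dfs -mkdir -p /data.

root@hadoop03:/opt/data# hdfs dfs -put user_log.csv /data/user_log.csv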

Load the data into the table from the Hive shell:

LOAD DATA INPATH 'hdfs:///data/user_log.csv' INTO TABLE cda.users;
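
Keep in mind that LOAD DATA INPATH with an HDFS path moves the file into the table's storage location instead of copying it, so /data/user_log.csv is no longer at its original path afterwards. If the file sits on the local filesystem of the machine running the Hive client, the LOCAL keyword copies it instead. A minimal sketch, assuming the CSV is at /opt/data/user_log.csv on that machine (an assumption, since the post uploads it from a different host); it saves the statement to load_local.hql and runs it only when a hive client is available:

```shell
# Write the alternative load statement to a script file.
cat > load_local.hql <<'EOF'
-- LOCAL reads from the client's local filesystem and copies the file into the table location.
LOAD DATA LOCAL INPATH '/opt/data/user_log.csv' INTO TABLE cda.users;
EOF

# Run the script only when a Hive client is installed; otherwise just keep the file.
if command -v hive >/dev/null 2>&1; then
  hive -f load_local.hql
else
  echo "hive client not found; statement saved to load_local.hql"
fi
```

Both variants append to the table; adding OVERWRITE before INTO TABLE replaces the existing contents instead.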

Part 3: Query

Start the Hive shell and select a few rows to confirm the load:

root@hadoop01:/opt/hive/bin# hive
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/opt/hive/lib/log4j-slf4j-impl-2.6.2.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/opt/hbase-1.4.13/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/opt/hadoop-2.7.3/share/hadoop/common/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]

Logging initialized using configuration in jar:file:/opt/hive/lib/hive-common-2.3.7.jar!/hive-log4j2.properties Async: true
Hive-on-MR is deprecated in Hive 2 and may not be available in the future versions. Consider using a different execution engine (i.e. spark, tez) or using Hive 1.X releases.
hive> select * from cda.users limit 10;

The query returns the loaded rows, which confirms that the import succeeded.
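
Beyond eyeballing the first ten rows, a row count gives a rough completeness check: the table should hold one row per line of the CSV, minus the header. The sketch below assumes a local copy of user_log.csv is still in the current directory, that the file ends with a newline, and that the table was loaded exactly once. The helper name expected_rows is made up for this example.

```shell
# Hypothetical helper: number of data rows in a CSV with a one-line header.
expected_rows() {
  echo $(( $(wc -l < "$1") - 1 ))
}

CSV_FILE="user_log.csv"   # assumed local copy of the uploaded file

# Print the expected count if the local copy is present.
if [ -f "$CSV_FILE" ]; then
  echo "expected rows: $(expected_rows "$CSV_FILE")"
fi

# Ask Hive for the actual count; -S suppresses the informational log output.
if command -v hive >/dev/null 2>&1; then
  hive -S -e "SELECT COUNT(*) FROM cda.users;"
fi
```

If the two numbers differ, likely causes are a repeated load, a missing trailing newline in the file, or quoted fields that contain line breaks.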

References

1. hive-load-csv-file-into-table, https://sparkbyexamples.com/apache-hive/hive-load-csv-file-into-table/
