Summary of Token Expiry Errors on Long-Running YARN Jobs (Invalid AMRMToken)

Background

4. Hive fails when inserting data from a non-partitioned table into a partitioned table:

Cannot insert into target table because column number/types are different ''<partition name>'': Table insclause-0 has 2 columns, but query has 3 columns.

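First, what the error means. Hive keeps partition columns as metadata in the metastore database (MySQL here); they are not written to the data files. Instead, their values are used as sub-directory names. The data columns of a partitioned table therefore do not include the partition columns, so when you insert into a partition named in the PARTITION clause, the SELECT must not supply the partition column.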

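A simple analogy: when lining up in gym class, the teacher often has the boys and the girls form separate lines. Is there any need to also put a gender label on each student?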

Here is how an answer on Stack Overflow explains the problem:

**In Hive the partitioning "columns" are managed as metadata >> they are not included in the data files, instead they are used as sub-directory names. So your partitioned table has just 2 real columns, and you must feed just 2 columns with your SELECT.**
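
The point made in the quote shows up directly in a table definition. The following is a minimal sketch, assuming a table named part (the name used in the example later in this post) and assumed column types INT and STRING, which the original does not state:

```sql
-- Assumed schema: the column types are illustrative, not taken from the original post.
-- id and name are the two "real" data columns stored in the data files.
CREATE TABLE part (
  id   INT,
  name STRING
)
-- sex is declared separately; its values become sub-directory names such as sex=M.
PARTITIONED BY (sex STRING);

-- The output includes a "# Partition Information" section that lists sex
-- as a partition column.
DESCRIBE part;
```

Because sex is declared in PARTITIONED BY and not in the column list, an insert into one fixed partition has to provide exactly two columns.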

For example, suppose your non-partitioned table non_part contains the following:

id	name	sex
1	tom 	M
2	mary	F
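
To reproduce this table, a sketch along the following lines should work. The column types are assumptions, and INSERT ... VALUES assumes a Hive version that supports it (0.14 or later):

```sql
-- Assumed schema for the non-partitioned source table; here sex is an ordinary
-- data column stored in the data files.
CREATE TABLE non_part (
  id   INT,
  name STRING,
  sex  STRING
);

-- The two sample rows shown above.
INSERT INTO TABLE non_part VALUES
  (1, 'tom',  'M'),
  (2, 'mary', 'F');
```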

Suppose your partitioned table part is partitioned by sex, and you want to insert the data from the non-partitioned table above into it:

You should write: insert into table part partition(sex='M') select id,name from non_part where sex='M'; (Yes, you insert exactly two columns; the partition column is not stored as data in the table.)

and not: insert into table part partition(sex='M') select * from non_part where sex='M';
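
Put together, loading every row of non_part takes one statement per partition. The sketch below uses only the table and column names from the example above. The final commented-out part is an alternative this post does not cover, Hive's dynamic partitioning, where the partition column is selected as the last column:

```sql
-- Static partition inserts: the partition value is fixed in the PARTITION clause,
-- so each SELECT supplies only the two data columns.
INSERT INTO TABLE part PARTITION (sex = 'M')
SELECT id, name FROM non_part WHERE sex = 'M';

INSERT INTO TABLE part PARTITION (sex = 'F')
SELECT id, name FROM non_part WHERE sex = 'F';

-- The partition column can still be queried, although it is not stored in the data files.
SELECT id, name, sex FROM part;

-- Lists one entry per partition, for example sex=M.
SHOW PARTITIONS part;

-- Alternative, not in addition to the statements above (the rows would be inserted twice):
-- let Hive derive the partition from the data. The partition column goes last in the SELECT.
-- SET hive.exec.dynamic.partition = true;
-- SET hive.exec.dynamic.partition.mode = nonstrict;
-- INSERT INTO TABLE part PARTITION (sex)
-- SELECT id, name, sex FROM non_part;
```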

Other points to note:

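1. When inserting from a non-partitioned table into a partitioned table, Hive translates the HiveQL into MapReduce jobs (when running on the MapReduce engine); Hive itself warns that Hive-on-MR may no longer be supported in the future.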

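2. Whenever you put data into a partitioned table, whether with load or with insert, you must name the partition after the table name. Note also that load places the entire contents of the file you load under the specified partition.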
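
Point 2 can be sketched with a load statement. The file path here is hypothetical, and the table is the part table from the example above:

```sql
-- Hypothetical local file. It should contain only the data columns (id, name),
-- because every row in it ends up in the sex='M' partition, whatever it contains.
-- LOAD DATA LOCAL copies the file as-is, so its layout has to match the table's row format.
LOAD DATA LOCAL INPATH '/tmp/part_m.txt'
INTO TABLE part PARTITION (sex = 'M');
```

Since load does not filter rows, a file that mixes both values of sex would need to be split first, or loaded into a non-partitioned staging table and then inserted partition by partition.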

https://www.cnblogs.com/lemonu/p/11279979.html
