scala - Unable to view data of Hive tables after update in Spark


Case: I have a table hivetest, which is an ORC table with transactions set to true. I loaded it in the Spark shell and viewed the data:

var rdd = objhivecontext.sql("select * from hivetest")
rdd.show()

--- I am able to view the data.

Now I went to the Hive shell (or Ambari) and updated the table, for example:

hive> update hivetest set name='test'   --- done, success
hive> select * from hivetest   -- able to view the updated data

Now when I come back to Spark and run the query again, I cannot view any data except the column names:

scala> var rdd1 = objhivecontext.sql("select * from hivetest")
scala> rdd1.show()

-- This time only the columns are printed; no data comes back.

Issue 2: I am unable to update from Spark SQL. When I run

scala> objhivecontext.sql("update hivetest set name='test'")

I get the error below:

org.apache.spark.sql.AnalysisException: Unsupported language features in query: insert into hivetest values(1,'sudhir','software',1,'it')
TOK_QUERY 0, 0,17, 0
  TOK_FROM 0, -1,17, 0
    TOK_VIRTUAL_TABLE 0, -1,17, 0
      TOK_VIRTUAL_TABREF 0, -1,-1, 0
        TOK_ANONYMOUS 0, -1,-1, 0
      TOK_VALUES_TABLE 1, 6,17, 28
        TOK_VALUE_ROW 1, 7,17, 28
          1 1, 8,8, 28
          'sudhir' 1, 10,10, 30
          'software' 1, 12,12, 39
          1 1, 14,14, 50
          'it' 1, 16,16, 52
  TOK_INSERT 1, 0,-1, 12
    TOK_INSERT_INTO 1, 0,4, 12
      TOK_TAB 1, 4,4, 12
        TOK_TABNAME 1, 4,4, 12
          hivetest 1, 4,4, 12
    TOK_SELECT 0, -1,-1, 0
      TOK_SELEXPR 0, -1,-1, 0
        TOK_ALLCOLREF 0, -1,-1, 0

scala.NotImplementedError: No parse rules for:
 TOK_VIRTUAL_TABLE 0, -1,17, 0
  TOK_VIRTUAL_TABREF 0, -1,-1, 0
    TOK_ANONYMOUS 0, -1,-1, 0
  TOK_VALUES_TABLE 1, 6,17, 28
    TOK_VALUE_ROW 1, 7,17, 28
      1 1, 8,8, 28
      'sudhir' 1, 10,10, 30
      'software' 1, 12,12, 39
      1 1, 14,14, 50
      'it' 1, 16,16, 52

org.apache.spark.sql.hive.HiveQl$.nodeToRelation(HiveQl.scala:1235)

This error is for an insert statement; I get the same sort of error for the update statement as well.

Have you tried objhivecontext.refreshTable("hivetest")?

Spark SQL aggressively caches Hive metastore data.

If an update happens outside of Spark SQL, you might experience unexpected results because Spark SQL's version of the Hive metastore is out of date.
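
To make the staleness concrete, here is a toy model in plain Scala. It is not Spark's implementation: ExternalStore, CachingClient and StaleCacheDemo are hypothetical names invented for this sketch, and only the method name refreshTable is borrowed from the real API. It shows a client that keeps serving what it cached until it is told to refresh.

```scala
import scala.collection.mutable

// Hypothetical stand-in for storage that other tools (e.g. the Hive shell) can modify.
final class ExternalStore {
  private val tables = mutable.Map.empty[String, Vector[String]]
  def write(table: String, rows: Vector[String]): Unit = tables(table) = rows
  def read(table: String): Vector[String] = tables.getOrElse(table, Vector.empty)
}

// Hypothetical stand-in for a client that caches what it read
// and only goes back to the store after an explicit refresh.
final class CachingClient(store: ExternalStore) {
  private val cache = mutable.Map.empty[String, Vector[String]]

  def select(table: String): Vector[String] =
    cache.getOrElseUpdate(table, store.read(table))

  // Drop the cached entry so the next select reloads from the store.
  def refreshTable(table: String): Unit = {
    cache.remove(table)
    ()
  }
}

object StaleCacheDemo {
  // Returns (first read, read after an outside update, read after refresh).
  def run(): (Vector[String], Vector[String], Vector[String]) = {
    val store = new ExternalStore
    store.write("hivetest", Vector("1,sudhir"))

    val client = new CachingClient(store)
    val first = client.select("hivetest")

    // The table is changed behind the client's back.
    store.write("hivetest", Vector("1,test"))
    val stale = client.select("hivetest") // still the cached rows

    client.refreshTable("hivetest")
    val fresh = client.select("hivetest") // reloaded from the store

    (first, stale, fresh)
  }

  def main(args: Array[String]): Unit = println(run())
}
```

In this toy the stale read returns the old rows, whereas the symptom in the question is an empty result. The point is only that the client's view and the underlying data can diverge until a refresh is requested.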

Here's some more info:

http://spark.apache.org/docs/latest/sql-programming-guide.html#metadata-refreshing

http://spark.apache.org/docs/latest/api/scala/index.html#org.apache.spark.sql.hive.hivecontext

The docs mostly mention Parquet, but this applies to ORC and other file formats as well.

With JSON, for example, if you add new files to a directory outside of Spark SQL, you'll need to call hiveContext.refreshTable() within Spark SQL to see the new data.
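
Applied to the table from the question, the suggestion would look roughly like the sketch below. It assumes a Spark 1.x build with Hive support on the classpath and reuses the asker's variable name objhivecontext; the object name and app name are made up for illustration. Whether a refresh is enough for a transactional ORC table is not confirmed in this thread, so treat it as something to try rather than a guaranteed fix.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.hive.HiveContext

// Hypothetical wrapper object; in spark-shell you would run only the body lines.
object RefreshAfterExternalUpdate {
  def main(args: Array[String]): Unit = {
    // spark-shell already provides `sc`; it is created here so the sketch stands alone.
    // The master URL is expected to come from spark-submit.
    val sc = new SparkContext(new SparkConf().setAppName("refresh-hivetest"))
    val objhivecontext = new HiveContext(sc)

    // First read: Spark SQL caches metadata for the table.
    objhivecontext.sql("select * from hivetest").show()

    // ... the table is updated from the Hive shell or Ambari at this point ...

    // Invalidate the cached metadata for the table, then read again.
    objhivecontext.refreshTable("hivetest")
    objhivecontext.sql("select * from hivetest").show()

    sc.stop()
  }
}
```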

