scala - Unable to view data of Hive tables after update in Spark -


Case: I have a table hivetest, an ORC table with transactions set to true. I loaded it in the spark shell and viewed its data:

var rdd = objHiveContext.sql("select * from hivetest")
rdd.show()

--- able to view the data

Now I went to the Hive shell (or Ambari) and updated the table, for example:

hive> update hivetest set name='test'   --- done, success
hive> select * from hivetest            --- able to view the updated data

Now when I come back to Spark and run the query again, I cannot view any data except the column names:

scala> var rdd1 = objHiveContext.sql("select * from hivetest")
scala> rdd1.show()

--- this time only the columns are printed; the data does not come

Issue 2: I am unable to update from Spark SQL. When I run scala> objHiveContext.sql("update hivetest set name='test'") I get the below error:

org.apache.spark.sql.AnalysisException: Unsupported language features in query:
insert into hivetest values(1,'sudhir','software',1,'it')
TOK_QUERY 0, 0,17, 0
  TOK_FROM 0, -1,17, 0
    TOK_VIRTUAL_TABLE 0, -1,17, 0
      TOK_VIRTUAL_TABREF 0, -1,-1, 0
        TOK_ANONYMOUS 0, -1,-1, 0
      TOK_VALUES_TABLE 1, 6,17, 28
        TOK_VALUE_ROW 1, 7,17, 28
          1 1, 8,8, 28
          'sudhir' 1, 10,10, 30
          'software' 1, 12,12, 39
          1 1, 14,14, 50
          'it' 1, 16,16, 52
  TOK_INSERT 1, 0,-1, 12
    TOK_INSERT_INTO 1, 0,4, 12
      TOK_TAB 1, 4,4, 12
        TOK_TABNAME 1, 4,4, 12
          hivetest 1, 4,4, 12
    TOK_SELECT 0, -1,-1, 0
      TOK_SELEXPR 0, -1,-1, 0
        TOK_ALLCOLREF 0, -1,-1, 0

scala.NotImplementedError: No parse rules for:
 TOK_VIRTUAL_TABLE 0, -1,17, 0
   TOK_VIRTUAL_TABREF 0, -1,-1, 0
     TOK_ANONYMOUS 0, -1,-1, 0
   TOK_VALUES_TABLE 1, 6,17, 28
     TOK_VALUE_ROW 1, 7,17, 28
       1 1, 8,8, 28
       'sudhir' 1, 10,10, 30
       'software' 1, 12,12, 39
       1 1, 14,14, 50
       'it' 1, 16,16, 52

org.apache.spark.sql.hive.HiveQl$.nodeToRelation(HiveQl.scala:1235)

This error is for the insert statement; I get the same sort of error for the update statement as well.
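As the error shows, the HiveQL parser used by HiveContext in Spark 1.x has no parse rules for `INSERT ... VALUES` (and none for `UPDATE`). A common workaround for appending rows is to build a DataFrame and append it to the table instead. This is only a sketch: the column names (`id`, `name`, `designation`, `dept_id`, `dept`) are assumptions inferred from the values in the question, and it assumes Spark 1.4+ where `DataFrameWriter.insertInto` is available.

```scala
// Sketch, assuming a HiveContext named objHiveContext (as in the question)
// and Spark 1.4+. Column names below are guesses, not from the source.
import objHiveContext.implicits._

val newRows = Seq((1, "sudhir", "software", 1, "it"))
  .toDF("id", "name", "designation", "dept_id", "dept")

// Append the rows to the existing Hive table instead of INSERT ... VALUES,
// which the Spark 1.x HiveQL parser cannot handle.
newRows.write.mode("append").insertInto("hivetest")
```

Row-level `UPDATE` on ACID/transactional Hive tables has no equivalent in Spark 1.x SQL; that statement still has to be run from Hive itself.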

Have you tried objHiveContext.refreshTable("hivetest")?

Spark SQL aggressively caches Hive metastore data.

If an update happens outside of Spark SQL, you might experience unexpected results because Spark SQL's version of the Hive metastore is out of date.

Here's more info:

http://spark.apache.org/docs/latest/sql-programming-guide.html#metadata-refreshing

http://spark.apache.org/docs/latest/api/scala/index.html#org.apache.spark.sql.hive.HiveContext

The docs mention Parquet, but this applies to ORC and other file formats as well.

With JSON, for example, if you add new files to a directory outside of Spark SQL, you'll need to call hiveContext.refreshTable() within Spark SQL to see the new data.
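Applied to the question, the fix is a one-line refresh before re-reading the table. A minimal sketch, reusing the `objHiveContext` name and `hivetest` table from the question:

```scala
// After the table is modified from the Hive shell or Ambari, invalidate
// Spark's cached metadata for it, then query again.
objHiveContext.refreshTable("hivetest")

val rdd1 = objHiveContext.sql("select * from hivetest")
rdd1.show()  // should now reflect the update made outside Spark
```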

