Test your Hive scripts inside your favorite IDE with HiveQLUnit! Increase your developers' productivity by testing on all operating systems, including Windows, Linux, and Mac OS X. Build continuous integration and delivery tests to control the releases of your big data products.
When I run the unit test I get this error:
java.lang.NoSuchMethodError: org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.getProxy(Lorg/apache/hadoop/hive/conf/HiveConf;Lorg/apache/hadoop/hive/metastore/HiveMetaHookLoader;Ljava/lang/String;)Lorg/apache/hadoop/hive/metastore/IMetaStoreClient;
at org.apache.hadoop.hive.ql.metadata.Hive.createMetaStoreClient(Hive.java:2661)
at org.apache.hadoop.hive.ql.metadata.Hive.getMSC(Hive.java:2680)
at org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:425)
at org.apache.spark.sql.hive.client.ClientWrapper.<init>(ClientWrapper.scala:194)
at org.apache.spark.sql.hive.client.IsolatedClientLoader.createClient(IsolatedClientLoader.scala:238)
at org.apache.spark.sql.hive.HiveContext.executionHive$lzycompute(HiveContext.scala:218)
at org.apache.spark.sql.hive.HiveContext.executionHive(HiveContext.scala:208)
at org.apache.spark.sql.hive.HiveContext.functionRegistry$lzycompute(HiveContext.scala:462)
at org.apache.spark.sql.hive.HiveContext.functionRegistry(HiveContext.scala:461)
at org.apache.spark.sql.UDFRegistration.<init>(UDFRegistration.scala:40)
at org.apache.spark.sql.SQLContext.<init>(SQLContext.scala:330)
at org.apache.spark.sql.hive.HiveContext.<init>(HiveContext.scala:90)
at org.apache.spark.sql.hive.HiveContext.<init>(HiveContext.scala:101)
at org.finra.hiveqlunit.rules.TestHiveServer$ConstructHiveContextStatement.evaluate(TestHiveServer.java:93)
at org.junit.rules.RunRules.evaluate(RunRules.java:20)
at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
at org.eclipse.jdt.internal.junit4.runner.JUnit4TestReference.run(JUnit4TestReference.java:86)
at org.eclipse.jdt.internal.junit.runner.TestExecution.run(TestExecution.java:38)
at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.runTests(RemoteTestRunner.java:459)
at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.runTests(RemoteTestRunner.java:678)
at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.run(RemoteTestRunner.java:382)
at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.main(RemoteTestRunner.java:192)
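A NoSuchMethodError like this usually means the hive-metastore classes on the test classpath come from a different Hive version than the one Spark's HiveContext was compiled against. Below is a small diagnostic sketch; the class and method names queried are the ones from the trace, while the MetastoreClasspathCheck class itself is mine, not part of HiveQLUnit. It prints which jar RetryingMetaStoreClient was actually loaded from and which getProxy overloads it really exposes:

import java.lang.reflect.Method;

public class MetastoreClasspathCheck {
    public static void main(String[] args) throws Exception {
        Class<?> cls = Class.forName(
                "org.apache.hadoop.hive.metastore.RetryingMetaStoreClient");
        // The jar the class was actually loaded from; with conflicting
        // hive-metastore versions this is often not the jar you expect.
        System.out.println(cls.getProtectionDomain().getCodeSource().getLocation());
        // The getProxy overloads that are really available; the signature
        // named in the NoSuchMethodError should appear on a healthy classpath.
        for (Method m : cls.getMethods()) {
            if (m.getName().equals("getProxy")) {
                System.out.println(m);
            }
        }
    }
}

If the expected overload is missing, aligning the Hive dependency versions with those Spark pulls in should resolve the error.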
Hello, I'm getting the following error. I'm on Windows 7; I tried Eclipse, IntelliJ, and the command line, all with the same error.
java.lang.AbstractMethodError: org.apache.spark.sql.hive.HiveContext$$anon$3.conf()Lorg/apache/spark/sql/catalyst/CatalystConf;
at org.apache.spark.sql.catalyst.analysis.OverrideFunctionRegistry$class.$init$(FunctionRegistry.scala:37)
at org.apache.spark.sql.hive.HiveContext$$anon$3.<init>(HiveContext.scala:267)
at org.apache.spark.sql.hive.HiveContext.functionRegistry$lzycompute(HiveContext.scala:267)
at org.apache.spark.sql.hive.HiveContext.functionRegistry(HiveContext.scala:266)
at org.apache.spark.sql.hive.HiveContext.functionRegistry(HiveContext.scala:49)
at org.apache.spark.sql.UDFRegistration.<init>(UDFRegistration.scala:41)
at org.apache.spark.sql.SQLContext.<init>(SQLContext.scala:266)
at org.apache.spark.sql.hive.HiveContext.<init>(HiveContext.scala:49)
at org.finra.hiveqlunit.rules.TestHiveServer$ConstructHiveContextStatement.evaluate(TestHiveServer.java:93)
at org.junit.rules.RunRules.evaluate(RunRules.java:18)
at org.junit.runners.ParentRunner.run(ParentRunner.java:292)
at org.eclipse.jdt.internal.junit4.runner.JUnit4TestReference.run(JUnit4TestReference.java:86)
at org.eclipse.jdt.internal.junit.runner.TestExecution.run(TestExecution.java:38)
at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.runTests(RemoteTestRunner.java:459)
at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.runTests(RemoteTestRunner.java:675)
at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.run(RemoteTestRunner.java:382)
at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.main(RemoteTestRunner.java:192)
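An AbstractMethodError inside HiveContext's FunctionRegistry is another symptom of mismatched jars, here most commonly spark-sql and spark-hive resolving to different Spark versions. As a quick check (a sketch of mine, not HiveQLUnit API), print where each module's key class was loaded from; both locations should name the same Spark version:

public class SparkJarCheck {
    public static void main(String[] args) {
        // Jar providing spark-sql
        System.out.println(org.apache.spark.sql.SQLContext.class
                .getProtectionDomain().getCodeSource().getLocation());
        // Jar providing spark-hive; a version mismatch between the two
        // produces exactly this kind of AbstractMethodError
        System.out.println(org.apache.spark.sql.hive.HiveContext.class
                .getProtectionDomain().getCodeSource().getLocation());
    }
}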
An exception is printed to the console (log), and a spark-* directory is left behind in java.io.tmpdir after the JVM completes a normal shutdown (strictly speaking, it can be another directory, since the location is configurable through a property, so that should be taken into account).
As far as I can investigate, the root cause is https://issues.apache.org/jira/browse/SPARK-8333; that is, there is an open bug in Spark (my system is Windows 10, 64-bit).
As a workaround, here is what I did:
1. Switched this message off in the log:
   <logger name="org.apache.spark.util.Utils" level="off" />
2. Wrote a rule that cleans up all spark* folders from java.io.tmpdir (or a more precise location) on JVM startup (see the sketch below).
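A minimal sketch of such a cleanup rule, assuming JUnit 4 and the default java.io.tmpdir location; the SparkTmpCleanupRule class name and the spark* prefix match are illustrative choices, not part of HiveQLUnit:

import java.io.File;
import org.junit.rules.ExternalResource;

// Deletes leftover spark* folders from java.io.tmpdir before tests run.
// Adjust the directory if the temp location was reconfigured.
public class SparkTmpCleanupRule extends ExternalResource {
    @Override
    protected void before() {
        File tmpDir = new File(System.getProperty("java.io.tmpdir"));
        File[] leftovers = tmpDir.listFiles((dir, name) -> name.startsWith("spark"));
        if (leftovers == null) {
            return;
        }
        for (File leftover : leftovers) {
            deleteRecursively(leftover);
        }
    }

    // Recursively deletes a directory tree; File.delete() alone
    // fails on non-empty directories.
    private static void deleteRecursively(File file) {
        File[] children = file.listFiles();
        if (children != null) {
            for (File child : children) {
                deleteRecursively(child);
            }
        }
        file.delete();
    }
}

Registered with @ClassRule, the cleanup runs once before the suite starts.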
Spark and Hive currently produce a lot of log output, which makes rooting out actual issues harder. The logging level needs to be restricted to WARN or some such.
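One way to do this programmatically, assuming the Log4j 1.x backend that Spark 1.x ships with (the class and method names here are illustrative, not existing HiveQLUnit API):

import org.apache.log4j.Level;
import org.apache.log4j.Logger;

public class QuietLogging {
    // Call once before constructing the HiveContext so the noisy
    // Spark and Hive loggers are already restricted to WARN.
    public static void restrictLogging() {
        Logger.getLogger("org.apache.spark").setLevel(Level.WARN);
        Logger.getLogger("org.apache.hadoop.hive").setLevel(Level.WARN);
    }
}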
The SubstituteVariableResource substitutes values for variables in wrapped TextResources (${variable} -> value), but each wrapper object handles only one variable. A more practical means is needed for substituting large numbers of variables.
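One possible direction is a single map-driven substitution pass instead of one wrapper per variable. The sketch below is illustrative only; the MultiSubstitution class and substitute helper are not part of the current HiveQLUnit API, though such a helper could back a new TextResource wrapper that accepts a whole map:

import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class MultiSubstitution {
    private static final Pattern VARIABLE = Pattern.compile("\\$\\{([^}]+)\\}");

    // Replaces every ${variable} occurrence in one pass using the
    // supplied map; unknown variables are left untouched.
    public static String substitute(String text, Map<String, String> values) {
        Matcher matcher = VARIABLE.matcher(text);
        StringBuffer result = new StringBuffer();
        while (matcher.find()) {
            String value = values.get(matcher.group(1));
            matcher.appendReplacement(result,
                    Matcher.quoteReplacement(value != null ? value : matcher.group()));
        }
        matcher.appendTail(result);
        return result.toString();
    }

    public static void main(String[] args) {
        Map<String, String> values = new HashMap<>();
        values.put("table", "users");
        values.put("limit", "10");
        // Prints: SELECT * FROM users LIMIT 10
        System.out.println(substitute("SELECT * FROM ${table} LIMIT ${limit}", values));
    }
}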
I've been trying out HiveQLUnit, and I have a test that sets up some tables and data. During this setup I get this exception:
java.lang.IllegalArgumentException: Setting negative mapred.reduce.tasks for automatically determining the number of reducers is not supported.
at org.apache.spark.sql.execution.SetCommand.run(commands.scala:92)
at org.apache.spark.sql.execution.ExecutedCommand.sideEffectResult$lzycompute(commands.scala:57)
at org.apache.spark.sql.execution.ExecutedCommand.sideEffectResult(commands.scala:57)
at org.apache.spark.sql.execution.ExecutedCommand.doExecute(commands.scala:68)
at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:88)
at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:88)
at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:147)
at org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:87)
at org.apache.spark.sql.SQLContext$QueryExecution.toRdd$lzycompute(SQLContext.scala:950)
at org.apache.spark.sql.SQLContext$QueryExecution.toRdd(SQLContext.scala:950)
at org.apache.spark.sql.DataFrame.<init>(DataFrame.scala:144)
at org.apache.spark.sql.DataFrame.<init>(DataFrame.scala:128)
at org.apache.spark.sql.DataFrame$.apply(DataFrame.scala:51)
at org.apache.spark.sql.SQLContext.sql(SQLContext.scala:755)
I've tried setting the property to a positive value with hqlContext.setConf("mapred.reduce.tasks", "2"); but it is not being picked up. I've debugged, and the hqlContext where I set the value is the same object as where it is being checked. Not really sure what is going on, any ideas? What's the best way to override this value?
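The trace points at SetCommand.run, so in Spark 1.x the exception is thrown while a SET statement itself executes: Spark translates mapred.reduce.tasks into spark.sql.shuffle.partitions and rejects negative values outright, which is why setting the conf beforehand changes nothing. If the setup script contains something like SET mapred.reduce.tasks=-1 (an assumption about the script; that setting is common in Hive scripts), the fix is to drop or rewrite that statement. A sketch, where hqlContext is the HiveContext from the test:

// Set the Spark-side property directly, avoiding the deprecated
// mapred.reduce.tasks alias entirely.
hqlContext.setConf("spark.sql.shuffle.partitions", "2");

// Or, if the SET must stay in the script, give it a positive value;
// Spark forwards it to spark.sql.shuffle.partitions with a warning.
hqlContext.sql("SET mapred.reduce.tasks=2");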
The Javadoc comments compile correctly into Javadocs, but the text is not arranged well into sentences and paragraphs. The final product looks rather messy.
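For what it's worth, Javadoc collapses source line breaks, so paragraph structure needs explicit <p> tags in the comment itself. A small illustration; the ScriptRunner interface and runScript method are hypothetical, not HiveQLUnit API:

public interface ScriptRunner {
    /**
     * Evaluates a Hive script against the test cluster.
     *
     * <p>Each statement in the script runs in order. Without the explicit
     * paragraph tags, Javadoc would render this comment as one run-on
     * block regardless of blank lines in the source.</p>
     */
    void runScript(String script);
}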