[WARNING] Could not transfer metadata net.minidev:json-smart/maven-metadata.xml from/to aws-glue-etl-artifacts-snapshot (s3://aws-glue-etl-artifacts-beta/snapshot): Cannot access s3://aws-glue-etl-artifacts-beta/snapshot with type default using the available connector factories: BasicRepositoryConnectorFactory
[WARNING] Could not transfer metadata commons-codec:commons-codec/maven-metadata.xml from/to aws-glue-etl-artifacts-snapshot (s3://aws-glue-etl-artifacts-beta/snapshot): Cannot access s3://aws-glue-etl-artifacts-beta/snapshot with type default using the available connector factories: BasicRepositoryConnectorFactory
19/09/02 20:34:14 WARN SparkContext: Another SparkContext is being constructed (or threw an exception in its constructor). This may indicate an error, since only one SparkContext may be running in this JVM (see SPARK-2243). The other SparkContext was created at:
org.apache.spark.api.java.JavaSparkContext.<init>(JavaSparkContext.scala:58)
File "/opt/spark-2.2.1-bin-hadoop2.7/python/lib/py4j-0.10.4-src.zip/py4j/protocol.py", line 319, in get_return_value
py4j.protocol.Py4JJavaError: An error occurred while calling None.org.apache.spark.api.java.JavaSparkContext.
$ cat /etc/lsb-release
DISTRIB_ID=Ubuntu
DISTRIB_RELEASE=18.04
DISTRIB_CODENAME=bionic
DISTRIB_DESCRIPTION="Ubuntu 18.04.3 LTS"
$ ls -1 /opt/
apache-maven-3.6.0
spark-2.2.1-bin-hadoop2.7
$ echo $SPARK_HOME
/opt/spark-2.2.1-bin-hadoop2.7
$ which mvn
/opt/apache-maven-3.6.0/bin/mvn
$ mvn --version
Apache Maven 3.6.0 (97c98ec64a1fdfee7767ce5ffb20918da4f719f3; 2018-10-24T11:41:47-07:00)
Maven home: /opt/apache-maven-3.6.0
Java version: 1.8.0_201, vendor: Oracle Corporation, runtime: /usr/lib/jvm/java-8-oracle/jre
Default locale: en_US, platform encoding: UTF-8
OS name: "linux", version: "4.15.0-58-generic", arch: "amd64", family: "unix"
$ which spark-shell
/opt/spark-2.2.1-bin-hadoop2.7/bin/spark-shell
$ git remote -v
origin [email protected]:awslabs/aws-glue-libs.git (fetch)
origin [email protected]:awslabs/aws-glue-libs.git (push)
$ git ll
* 968179f - (HEAD -> master, origin/master, origin/HEAD) Use AWSGlueETL jars to run the glue python shell/submit locally (5 days ago) <Vinay Kumar Vavili>
* 19c4d84 - Update year to 2019. (7 months ago) <Ben Sowell>
* 7e76cc9 - Update AWS Glue ETL Library to latest version (01/2019). (7 months ago) <Ben Sowell>
* 21ff9e2 - Adding standard files (1 year, 2 months ago) <Henri Yandell>
$ ./bin/gluepyspark
...
[WARNING] Could not transfer metadata net.minidev:json-smart/maven-metadata.xml from/to aws-glue-etl-artifacts-snapshot (s3://aws-glue-etl-artifacts-beta/snapshot): Cannot access s3://aws-glue-etl-artifacts-beta/snapshot with type default using the available connector factories: BasicRepositoryConnectorFactory
[WARNING] Could not transfer metadata commons-codec:commons-codec/maven-metadata.xml from/to aws-glue-etl-artifacts-snapshot (s3://aws-glue-etl-artifacts-beta/snapshot): Cannot access s3://aws-glue-etl-artifacts-beta/snapshot with type default using the available connector factories: BasicRepositoryConnectorFactory
...
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 1.911 s
[INFO] Finished at: 2019-09-02T20:34:12-07:00
[INFO] ------------------------------------------------------------------------
mkdir: cannot create directory ‘/home/joe/src/jupiter/jupiter-glue/aws-glue-libs/conf’: File exists
Python 3.6.7 | packaged by conda-forge | (default, Jul 2 2019, 02:18:42)
[GCC 7.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/home/dlweber/src/jupiter/jupiter-glue/aws-glue-libs/jars/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/opt/spark-2.2.1-bin-hadoop2.7/jars/slf4j-log4j12-1.7.16.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
19/09/02 20:34:14 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
19/09/02 20:34:14 WARN Utils: Your hostname, weber-jupiter resolves to a loopback address: 127.0.1.1; using 10.0.2.15 instead (on interface enp0s3)
19/09/02 20:34:14 WARN Utils: Set SPARK_LOCAL_IP if you need to bind to another address
19/09/02 20:34:14 WARN SparkContext: Another SparkContext is being constructed (or threw an exception in its constructor). This may indicate an error, since only one SparkContext may be running in this JVM (see SPARK-2243). The other SparkContext was created at:
org.apache.spark.api.java.JavaSparkContext.<init>(JavaSparkContext.scala:58)
sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
java.lang.reflect.Constructor.newInstance(Constructor.java:423)
py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:247)
py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357)
py4j.Gateway.invoke(Gateway.java:236)
py4j.commands.ConstructorCommand.invokeConstructor(ConstructorCommand.java:80)
py4j.commands.ConstructorCommand.execute(ConstructorCommand.java:69)
py4j.GatewayConnection.run(GatewayConnection.java:214)
java.lang.Thread.run(Thread.java:748)
Traceback (most recent call last):
File "/opt/spark-2.2.1-bin-hadoop2.7/python/pyspark/shell.py", line 45, in <module>
spark = SparkSession.builder\
File "/opt/spark-2.2.1-bin-hadoop2.7/python/pyspark/sql/session.py", line 173, in getOrCreate
sc = SparkContext.getOrCreate(sparkConf)
File "/opt/spark-2.2.1-bin-hadoop2.7/python/pyspark/context.py", line 334, in getOrCreate
SparkContext(conf=conf or SparkConf())
File "/opt/spark-2.2.1-bin-hadoop2.7/python/pyspark/context.py", line 118, in __init__
conf, jsc, profiler_cls)
File "/opt/spark-2.2.1-bin-hadoop2.7/python/pyspark/context.py", line 180, in _do_init
self._jsc = jsc or self._initialize_context(self._conf._jconf)
File "/opt/spark-2.2.1-bin-hadoop2.7/python/pyspark/context.py", line 273, in _initialize_context
return self._jvm.JavaSparkContext(jconf)
File "/opt/spark-2.2.1-bin-hadoop2.7/python/lib/py4j-0.10.4-src.zip/py4j/java_gateway.py", line 1401, in __call__
File "/opt/spark-2.2.1-bin-hadoop2.7/python/lib/py4j-0.10.4-src.zip/py4j/protocol.py", line 319, in get_return_value
py4j.protocol.Py4JJavaError: An error occurred while calling None.org.apache.spark.api.java.JavaSparkContext.
: java.lang.NoSuchMethodError: io.netty.util.ResourceLeakDetector.addExclusions(Ljava/lang/Class;[Ljava/lang/String;)V
at io.netty.buffer.AbstractByteBufAllocator.<clinit>(AbstractByteBufAllocator.java:34)
at org.apache.spark.network.util.NettyUtils.createPooledByteBufAllocator(NettyUtils.java:112)
at org.apache.spark.network.client.TransportClientFactory.<init>(TransportClientFactory.java:107)
at org.apache.spark.network.TransportContext.createClientFactory(TransportContext.java:99)
at org.apache.spark.rpc.netty.NettyRpcEnv.<init>(NettyRpcEnv.scala:70)
at org.apache.spark.rpc.netty.NettyRpcEnvFactory.create(NettyRpcEnv.scala:453)
at org.apache.spark.rpc.RpcEnv$.create(RpcEnv.scala:56)
at org.apache.spark.SparkEnv$.create(SparkEnv.scala:246)
at org.apache.spark.SparkEnv$.createDriverEnv(SparkEnv.scala:175)
at org.apache.spark.SparkContext.createSparkEnv(SparkContext.scala:257)
at org.apache.spark.SparkContext.<init>(SparkContext.scala:432)
at org.apache.spark.api.java.JavaSparkContext.<init>(JavaSparkContext.scala:58)
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:247)
at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357)
at py4j.Gateway.invoke(Gateway.java:236)
at py4j.commands.ConstructorCommand.invokeConstructor(ConstructorCommand.java:80)
at py4j.commands.ConstructorCommand.execute(ConstructorCommand.java:69)
at py4j.GatewayConnection.run(GatewayConnection.java:214)
at java.lang.Thread.run(Thread.java:748)
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/opt/spark-2.2.1-bin-hadoop2.7/python/pyspark/shell.py", line 54, in <module>
spark = SparkSession.builder.getOrCreate()
File "/opt/spark-2.2.1-bin-hadoop2.7/python/pyspark/sql/session.py", line 173, in getOrCreate
sc = SparkContext.getOrCreate(sparkConf)
File "/opt/spark-2.2.1-bin-hadoop2.7/python/pyspark/context.py", line 334, in getOrCreate
SparkContext(conf=conf or SparkConf())
File "/opt/spark-2.2.1-bin-hadoop2.7/python/pyspark/context.py", line 118, in __init__
conf, jsc, profiler_cls)
File "/opt/spark-2.2.1-bin-hadoop2.7/python/pyspark/context.py", line 180, in _do_init
self._jsc = jsc or self._initialize_context(self._conf._jconf)
File "/opt/spark-2.2.1-bin-hadoop2.7/python/pyspark/context.py", line 273, in _initialize_context
return self._jvm.JavaSparkContext(jconf)
File "/opt/spark-2.2.1-bin-hadoop2.7/python/lib/py4j-0.10.4-src.zip/py4j/java_gateway.py", line 1401, in __call__
File "/opt/spark-2.2.1-bin-hadoop2.7/python/lib/py4j-0.10.4-src.zip/py4j/protocol.py", line 319, in get_return_value
py4j.protocol.Py4JJavaError: An error occurred while calling None.org.apache.spark.api.java.JavaSparkContext.
: java.lang.NoClassDefFoundError: Could not initialize class io.netty.buffer.PooledByteBufAllocator
at org.apache.spark.network.util.NettyUtils.createPooledByteBufAllocator(NettyUtils.java:112)
at org.apache.spark.network.client.TransportClientFactory.<init>(TransportClientFactory.java:107)
at org.apache.spark.network.TransportContext.createClientFactory(TransportContext.java:99)
at org.apache.spark.rpc.netty.NettyRpcEnv.<init>(NettyRpcEnv.scala:70)
at org.apache.spark.rpc.netty.NettyRpcEnvFactory.create(NettyRpcEnv.scala:453)
at org.apache.spark.rpc.RpcEnv$.create(RpcEnv.scala:56)
at org.apache.spark.SparkEnv$.create(SparkEnv.scala:246)
at org.apache.spark.SparkEnv$.createDriverEnv(SparkEnv.scala:175)
at org.apache.spark.SparkContext.createSparkEnv(SparkContext.scala:257)
at org.apache.spark.SparkContext.<init>(SparkContext.scala:432)
at org.apache.spark.api.java.JavaSparkContext.<init>(JavaSparkContext.scala:58)
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:247)
at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357)
at py4j.Gateway.invoke(Gateway.java:236)
at py4j.commands.ConstructorCommand.invokeConstructor(ConstructorCommand.java:80)
at py4j.commands.ConstructorCommand.execute(ConstructorCommand.java:69)
at py4j.GatewayConnection.run(GatewayConnection.java:214)
at java.lang.Thread.run(Thread.java:748)
>>>
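The `NoSuchMethodError` on `io.netty.util.ResourceLeakDetector.addExclusions`, together with the SLF4J multiple-bindings warning (one `slf4j-log4j12` jar under `aws-glue-libs/jars`, another under `$SPARK_HOME/jars`), suggests that two different builds of the same libraries end up on the classpath. A minimal diagnostic sketch for spotting such version clashes between two jar directories, assuming jar filenames follow the usual `artifact-version.jar` convention (the function names here are hypothetical, not part of aws-glue-libs):

```python
import os
import re

# Matches e.g. "slf4j-log4j12-1.7.10.jar" -> ("slf4j-log4j12", "1.7.10")
JAR_PATTERN = re.compile(r"^(?P<artifact>[A-Za-z][\w.-]*?)-(?P<version>\d[\w.-]*)\.jar$")

def jar_versions(jar_dir):
    """Return {artifact: version} for every jar file in jar_dir."""
    versions = {}
    for name in os.listdir(jar_dir):
        m = JAR_PATTERN.match(name)
        if m:
            versions[m.group("artifact")] = m.group("version")
    return versions

def conflicting_jars(dir_a, dir_b):
    """Artifacts present in both directories with differing versions."""
    a, b = jar_versions(dir_a), jar_versions(dir_b)
    return {art: (a[art], b[art]) for art in a.keys() & b.keys() if a[art] != b[art]}
```

Running `conflicting_jars` against `/opt/spark-2.2.1-bin-hadoop2.7/jars` and the `jars` directory inside the aws-glue-libs checkout would show whether, for example, two incompatible Netty builds coexist, which is one plausible cause of the `NoSuchMethodError` above.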