keedio / buildoop
Hadoop Ecosystem Builder: Build, package, test and deploy your Hadoop ecosystem project.
License: Apache License 2.0
The Zookeeper package does not include the Zookeeper REST Server.
It would be good to include it in order to access its functionality from HUE.
Since HUE 3 there is an enhanced Zookeeper app that needs the Zookeeper REST server to be running (by default on the same server as Zookeeper itself) and listening on port 9998 (by default) in order to access the znode hierarchy.
From a complete source package, after you build it, you can start the REST service as follows:
cd src/contrib/rest
nohup ant run &
I've tested it manually with Zookeeper 3.4.5 and Hue 3.5.0 versions.
Zookeeper REST Service --> https://github.com/apache/zookeeper/tree/trunk/src/contrib/rest
HUE 3 announcement and setup instructions --> http://gethue.tumblr.com/post/67482299450/new-zookeeper-browser-app
In hadoop-hdfs-datanode init script line 103 is
start_daemon -u $SVC_USER $EXEC_PATH --config "$CONF_DIR" stop $DAEMON_FLAGS
but should be
if [ -n "$HADOOP_SECURE_DN_USER" ]; then
TARGET_USER=root
else
TARGET_USER=${HADOOP_DATANODE_USER:-hdfs}
fi
start_daemon -u $TARGET_USER $EXEC_PATH --config "$CONF_DIR" stop $DAEMON_FLAGS
The if statement checks whether the cluster is secured; if so, the DataNode uses jsvc, which is launched by root rather than the hdfs user.
The SysV init script /etc/init.d/hadoop-hdfs-namenode uses
/usr/lib/bigtop-utils/bigtop-detect-javahome to detect the JAVA_HOME environment variable.
This should be changed to a more standard behaviour. For example:
$ cat /etc/profile.d/java.sh
export JAVA_HOME=/usr/java/jdk1.7.0_51
The init script should then check the JAVA_HOME variable directly.
It is better to set up JAVA_HOME with the classical configuration than with the non-standard use of bigtop-utils.
This patch is pending submission.
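A minimal sketch of what the init-script check could look like under this proposal (the function name and error message are illustrative, not the actual patch):

```shell
# Hedged sketch: source the standard profile drop-in and fail fast
# if JAVA_HOME is still unset, instead of calling bigtop-detect-javahome.
detect_java_home() {
    profile=${1:-/etc/profile.d/java.sh}
    if [ -f "$profile" ]; then
        . "$profile"
    fi
    if [ -z "$JAVA_HOME" ]; then
        echo "Error: JAVA_HOME is not set" >&2
        return 1
    fi
    echo "Using JAVA_HOME=$JAVA_HOME"
}
```

The init script would call this early and abort with a clear message instead of silently starting with a broken Java environment.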
nice: /usr/lib/spark/spark-class: No such file or directory
The storm package is a key piece in the OpenBus ecosystem, together with Spark Streaming and Batch.
2014-03-01 09:55:04,225 ERROR org.apache.hadoop.mapreduce.v2.hs.HistoryFileManager: Error while trying to move a job to done
org.apache.hadoop.security.AccessControlException: Permission denied: user=mapred, access=READ, inode="/user/history/done_intermediate/hdfs/job_1393662509265_0003.summary":hdfs:supergroup:-rwxrwx---
$ sudo -E hadoop-fuse-dfs dfs://doop-manager.buildoop.org:8020 /mnt/hdfs-mount/
INFO /home/jroman/HACK/buildoop.git/build/work/hadoop-2.2.0_openbus-0.0.1-r1/rpmbuild/BUILD/hadoop-2.2.0-src/hadoop-hdfs-project/hadoop-hdfs/src/main/native/fuse-dfs/fuse_options.c:164 Adding FUSE arg /mnt/hdfs-mount/
$ ls /mnt/hdfs-mount
ls: cannot access /mnt/hdfs-mount: Input/output error
When trying to start HBase it fails with:
java.lang.ClassNotFoundException: org.apache.hadoop.metrics2.lib.MetricMutable
which is a Hadoop 1.2.1 class:
http://hadoop.apache.org/docs/r1.2.1/api/org/apache/hadoop/metrics2/lib/MetricMutable.html
It is missing in Hadoop 2.2.0:
http://hadoop.apache.org/docs/r2.2.0/api/org/apache/hadoop/metrics2/lib/package-frame.html
It looks like this class has been renamed to:
http://hadoop.apache.org/docs/r2.2.0/api/org/apache/hadoop/metrics2/lib/MutableMetric.html
So a rebuild of the package against Hadoop 2.2.0 is required.
Create a bash or Groovy script to check that local filesystem permissions and HDFS permissions are set up properly.
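A sketch of the local-filesystem half of such a checker; the helper name and the expected paths/modes are illustrative, not the project's actual list:

```shell
# Hedged sketch: compare a path's octal mode against the expected value.
# (Uses GNU stat; an HDFS variant would use `hadoop fs -stat` instead.)
check_perm() {
    path=$1
    expected=$2
    actual=$(stat -c '%a' "$path" 2>/dev/null) || {
        echo "MISSING $path"
        return 1
    }
    if [ "$actual" = "$expected" ]; then
        echo "OK $path $actual"
    else
        echo "BAD $path expected=$expected actual=$actual"
        return 1
    fi
}
```

Usage would be a table of `check_perm /var/log/hadoop-hdfs 755`-style calls, failing the script on the first mismatch.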
[ERROR] Failed to execute goal org.apache.maven.plugins:maven-compiler-plugin:2.5.1:testCompile (default-testCompile) on project hadoop-auth: Compilation failure: Compilation failure:
[ERROR] /home/jroman/HACK/temp/hadoop-2.2.0-src-vanilla/hadoop-common-project/hadoop-auth/src/test/java/org/apache/hadoop/security/authentication/client/AuthenticatorTestCase.java:[86,13] cannot access org.mortbay.component.AbstractLifeCycle
[ERROR] class file for org.mortbay.component.AbstractLifeCycle not found
[ERROR] server = new Server(0);
[ERROR] /home/jroman/HACK/temp/hadoop-2.2.0-src-vanilla/hadoop-common-project/hadoop-auth/src/test/java/org/apache/hadoop/security/authentication/client/AuthenticatorTestCase.java:[96,29] cannot access org.mortbay.component.LifeCycle
[ERROR] class file for org.mortbay.component.LifeCycle not found
[ERROR] server.getConnectors()[0].setHost(host);
[ERROR] /home/jroman/HACK/temp/hadoop-2.2.0-src-vanilla/hadoop-common-project/hadoop-auth/src/test/java/org/apache/hadoop/security/authentication/client/AuthenticatorTestCase.java:[98,10] cannot find symbol
[ERROR] symbol : method start()
[ERROR] location: class org.mortbay.jetty.Server
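This build break looks like the known upstream issue HADOOP-10110 (hadoop-auth missing a test dependency): org.mortbay.component.AbstractLifeCycle lives in the jetty-util artifact, which hadoop-auth does not pull in. A hedged sketch of the workaround, added to hadoop-common-project/hadoop-auth/pom.xml (version is managed by the parent pom in this sketch):

```xml
<dependency>
  <groupId>org.mortbay.jetty</groupId>
  <artifactId>jetty-util</artifactId>
  <scope>test</scope>
</dependency>
```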
The Buildoop CLI must abort when the rpmbuild tool fails its build. The CLI has to show an error message to the user.
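The behaviour could be sketched as a small wrapper (the function name and message are illustrative; Buildoop itself is Groovy, where the equivalent is checking the process exit value):

```shell
# Hedged sketch: run a build command, and on failure print an error
# and propagate the non-zero exit code so the CLI aborts.
run_rpmbuild() {
    "$@"
    status=$?
    if [ $status -ne 0 ]; then
        echo "ERROR: rpmbuild failed with exit code $status" >&2
        return $status
    fi
}
```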
The DataNode needs JSVC in order to run the DN daemon on privileged socket ports. The DN package should depend on the standard packages:
jakarta-commons-daemon-1.0.1-8.9.el6.x86_64
jakarta-commons-daemon-jsvc-1.0.1-8.9.el6.x86_64
However this version has bugs when running with Hadoop. This is why Bigtop uses its own package: https://issues.apache.org/jira/browse/BIGTOP-389
We have to do the same.
Services shouldn't be set to "on" in chkconfig; services should be started by monit.
Based on:
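A hedged sketch of what the monit side could look like for one service (pidfile path is illustrative):

```
check process hadoop-hdfs-namenode
    with pidfile /var/run/hadoop-hdfs/hadoop-hdfs-namenode.pid
    start program = "/etc/init.d/hadoop-hdfs-namenode start"
    stop program  = "/etc/init.d/hadoop-hdfs-namenode stop"
```

With entries like this, the init scripts stay as plain start/stop hooks while monit owns supervision and restarts.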
Create a similar mechanism to populate the initial HDFS layout. This task should probably be delegated to Puppet.
Symbolic links:
/usr/lib/flume/lib/hadoop-auth.jar -> /usr/lib/hadoop/hadoop-auth.jar should point to /usr/lib/hadoop/hadoop-auth-2.2.0.jar
/usr/lib/flume/lib/hadoop-common.jar -> /usr/lib/hadoop/hadoop-common.jar should point to /usr/lib/hadoop/hadoop-common-2.2.0.jar
Apache Phoenix is a SQL skin over HBase delivered as a client-embedded JDBC driver targeting low-latency queries over HBase data. That means accessing HBase data through a well-defined, industry-standard API.
Create package for HBASE_VERSION=0.96.1.1_openbus-0.0.1-r1
Missing dependency on the JDK in the kafka package.
Create HIVE package
In the start daemon script, the installation command has a hardcoded user and group; it should refer to the previously defined variable $SVC_USER, and perhaps declare a new variable $SVC_GROUP.
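A minimal sketch of the proposed change (defaults, the helper name, and the directory are illustrative; $SVC_GROUP is the new variable suggested above):

```shell
# Hedged sketch: use the service variables instead of a hardcoded
# user/group in the install step.
SVC_USER=${SVC_USER:-hdfs}
SVC_GROUP=${SVC_GROUP:-hadoop}

make_run_dir() {
    # Create the runtime directory owned by the service user/group.
    install -d -o "$SVC_USER" -g "$SVC_GROUP" -m 0755 "$1"
}
```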
CentOS/Red Hat Enterprise Linux 5.6 or later uses AES-256 encryption by default for Kerberos tickets, so you must install the Java Cryptography Extension (JCE) Unlimited Strength Jurisdiction Policy Files on all cluster and Hadoop user machines. This is a ZIP file from Oracle, and it is not packaged for any distro.
We have to use another encryption type by default to avoid this extra deployment step.
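One way to do this, sketched and hedged (the exact enctype list is a policy choice, not a recommendation from this project), is to drop aes256-cts from the KDC's supported enctypes so tickets never require the unlimited-strength policy files:

```
# /var/kerberos/krb5kdc/kdc.conf, realm section (illustrative):
supported_enctypes = aes128-cts:normal des3-hmac-sha1:normal arcfour-hmac:normal
```

Existing principals created with AES-256 keys would need their keys regenerated after this change.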
The build of this main component for the Quorum Journal Manager in HDFS HA is missing from hadoop-2.2.0 in the BOM openbus-0.0.1.
This failure occurs because the Hortonworks hadoop.spec file does not build this service, and the hadoop.spec for openbus-0.0.1 uses that Hortonworks SPEC file as its base.
The problem is fixed with:
rpm-build-stage:
...
-Dhadoop.version=$HADOOP_VERSION
...
Top pom.xml:
com.google.protobuf
Create package for HUE_VERSION=3.5.0_openbus-0.0.1-r1
The Sqoop service is missing an export for CATALINA_PID, so when Tomcat starts it does not create a pidfile, and the init script can neither detect that it started properly, stop the service, nor show the correct status.
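A hedged sketch of the fix (the pidfile path is illustrative): export CATALINA_PID in the service's environment before Tomcat is launched, so Tomcat writes its PID there and the init script can track it.

```shell
# Hedged sketch, e.g. in the Sqoop server env file sourced by the
# init script before starting Tomcat:
export CATALINA_PID=/var/run/sqoop/sqoop-server.pid
```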
Pig package creation
The oozie package build produces very verbose output, which causes a deadlock when the internal output pipe fills up. This can be fixed by using the consumeProcessOutput method in the execute function.
Flume runs as the flume user, which lacks read permission on some log files. Flume should be able to run as the root user to have unrestricted access to all log files.
Possible workarounds:
hadoop-auth.jar -> /usr/lib/hadoop/hadoop-auth.jar
hadoop-common.jar -> /usr/lib/hadoop/hadoop-common.jar
The generic "hadoop-auth.jar" and "hadoop-common.jar" links must exist:
ln -s /usr/lib/hadoop/hadoop-auth-2.2.0.jar /usr/lib/hadoop/hadoop-auth.jar
ln -s /usr/lib/hadoop/hadoop-common-2.2.0.jar /usr/lib/hadoop/hadoop-common.jar
Those links must be created by hadoop-2.2.0-openbus0.0.1_1 package.
The testing framework is based on Vagrant using the default provider (VirtualBox); however, it would probably be much better to use a more lightweight provider such as LXC (Linux Containers).
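A hedged sketch of what the switch could look like, assuming the third-party vagrant-lxc plugin and an LXC-compatible box (box name is illustrative):

```ruby
# Vagrantfile sketch for the LXC provider (requires `vagrant plugin
# install vagrant-lxc`; containers share the host kernel, so no VM boot).
Vagrant.configure("2") do |config|
  config.vm.box = "fgrehm/centos-6-64-lxc"
  config.vm.provider :lxc
end
```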
Create package for OOZIE_VERSION=4.0.0_openbus-0.0.1-r1
sudo -E /etc/init.d/hadoop-hdfs-namenode stop
Stopping Hadoop namenode: [ OK ]
The daemon does not stop with this command; the PID file used when the stop() function is executed is not correct.
Install phase error:
jzmqstorm package is a dependency of #44
Set up a minimal multi-machine Vagrant environment based on the VirtualBox provider.
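A minimal sketch of such an environment (hostnames follow the doop-manager.buildoop.org naming seen elsewhere in these issues; the box name and IPs are illustrative):

```ruby
# Vagrantfile sketch: one manager and one worker on a private network.
Vagrant.configure("2") do |config|
  config.vm.box = "centos/6"

  config.vm.define "manager" do |m|
    m.vm.hostname = "doop-manager.buildoop.org"
    m.vm.network "private_network", ip: "192.168.33.10"
  end

  config.vm.define "worker1" do |w|
    w.vm.hostname = "doop-worker1.buildoop.org"
    w.vm.network "private_network", ip: "192.168.33.11"
  end
end
```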