sergevs / ansible-cloudera-hadoop
An Ansible playbook to deploy Cloudera Hadoop components to a cluster.
License: MIT License
TASK [common : install packages]
error: (item=[u'java-1.8.0', u'bigtop-utils']) => {"changed": false, "failed": true, "invocation": {"module_args": {"conf_file": null, "disable_gpg_check": false, "disablerepo": null, "enablerepo": null, "exclude": null, "install_repoquery": true, "list": null, "name": ["java-1.8.0", "bigtop-utils"], "state": "latest", "update_cache": false, "validate_certs": true}, "module_name": "yum"}, "item": ["java-1.8.0", "bigtop-utils"], "msg": "No Package matching 'bigtop-utils' found available, installed or updated", "rc": 0, "results": ["All packages providing java-1.8.0 are up to date"]}
Good afternoon. I have run into a problem: the hadoop-yarn-nodemanager daemons do not start on the datanodes.
All installation steps up to this point complete without errors, but this particular daemon fails to start.
How can this problem be solved?
The environment on which I run the playbook:
lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description: Ubuntu 22.04.3 LTS
Release: 22.04
Codename: jammy
Ansible version:
ansible --version
ansible 2.10.8
config file = None
configured module search path = ['/home/user/.ansible/plugins/modules', '/usr/share/ansible/plugins/modules']
ansible python module location = /usr/lib/python3/dist-packages/ansible
executable location = /usr/bin/ansible
python version = 3.10.12 (main, Jun 11 2023, 05:26:28) [GCC 11.4.0]
Environment of the target servers:
cat /etc/redhat-release
CentOS Linux release 7.9.2009 (Core)
HDFS version installed by the playbook:
Hadoop 3.3.5
Source code repository https://github.com/apache/bigtop.git -r 4a34226ec01a894fd96cc00c052d96e61673c60e
Compiled by jenkins on 2023-08-02T05:53Z
Compiled with protoc 3.7.1
Playbook invocation:
ansible-playbook -i inventory/yandex/dev/hosts-hadoop-cloudera playbooks/hadoop-cloudera.yml
Failing Ansible task:
TASK [hadoop : start services] ************************************************************************************************************************************************************************************
changed: [hadoop-datanode06-dev] => (item=hadoop-hdfs-datanode)
changed: [hadoop-datanode04-dev] => (item=hadoop-hdfs-datanode)
changed: [hadoop-datanode05-dev] => (item=hadoop-hdfs-datanode)
failed: [hadoop-datanode05-dev] (item=hadoop-yarn-nodemanager) => {"ansible_loop_var": "item", "changed": false, "item": "hadoop-yarn-nodemanager", "msg": "Unable to start service hadoop-yarn-nodemanager: Job for hadoop-yarn-nodemanager.service failed because the control process exited with error code. See \"systemctl status hadoop-yarn-nodemanager.service\" and \"journalctl -xe\" for details.\n"}
failed: [hadoop-datanode06-dev] (item=hadoop-yarn-nodemanager) => {"ansible_loop_var": "item", "changed": false, "item": "hadoop-yarn-nodemanager", "msg": "Unable to start service hadoop-yarn-nodemanager: Job for hadoop-yarn-nodemanager.service failed because the control process exited with error code. See \"systemctl status hadoop-yarn-nodemanager.service\" and \"journalctl -xe\" for details.\n"}
failed: [hadoop-datanode04-dev] (item=hadoop-yarn-nodemanager) => {"ansible_loop_var": "item", "changed": false, "item": "hadoop-yarn-nodemanager", "msg": "Unable to start service hadoop-yarn-nodemanager: Job for hadoop-yarn-nodemanager.service failed because the control process exited with error code. See \"systemctl status hadoop-yarn-nodemanager.service\" and \"journalctl -xe\" for details.\n"}
First error on the datanodes:
23/08/31 06:18:08 ERROR containermanager.AuxServices: Failed to initialize mapreduce_shuffle
java.lang.RuntimeException: java.lang.RuntimeException: java.lang.ClassNotFoundException: Class org.apache.hadoop.mapred.ShuffleHandler not found
at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:2726)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.AuxServices.createAuxServiceFromConfiguration(AuxServices.java:204)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.AuxServices.createAuxService(AuxServices.java:297)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.AuxServices.initAuxService(AuxServices.java:452)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.AuxServices.serviceInit(AuxServices.java:758)
at org.apache.hadoop.service.AbstractService.init(AbstractService.java:164)
at org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:109)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl.serviceInit(ContainerManagerImpl.java:327)
at org.apache.hadoop.service.AbstractService.init(AbstractService.java:164)
at org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:109)
at org.apache.hadoop.yarn.server.nodemanager.NodeManager.serviceInit(NodeManager.java:494)
at org.apache.hadoop.service.AbstractService.init(AbstractService.java:164)
at org.apache.hadoop.yarn.server.nodemanager.NodeManager.initAndStartNodeManager(NodeManager.java:962)
at org.apache.hadoop.yarn.server.nodemanager.NodeManager.main(NodeManager.java:1042)
Caused by: java.lang.RuntimeException: java.lang.ClassNotFoundException: Class org.apache.hadoop.mapred.ShuffleHandler not found
at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:2693)
at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:2718)
... 13 more
Caused by: java.lang.ClassNotFoundException: Class org.apache.hadoop.mapred.ShuffleHandler not found
at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:2597)
at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:2691)
... 14 more
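A ClassNotFoundException for org.apache.hadoop.mapred.ShuffleHandler usually means yarn-site.xml enables the mapreduce_shuffle aux service but the hadoop-mapreduce jars containing that class are not on the NodeManager classpath. A hedged sketch of what to verify — the property names are standard Hadoop, but the file paths and jar locations assume Bigtop packaging and may differ on your nodes:

```shell
# Write the yarn-site.xml properties that must be consistent for the shuffle
# aux service to a scratch file (the real file normally lives in
# /etc/hadoop/conf/yarn-site.xml):
cat > /tmp/yarn-site-shuffle.xml <<'EOF'
<property>
  <name>yarn.nodemanager.aux-services</name>
  <value>mapreduce_shuffle</value>
</property>
<property>
  <name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
  <value>org.apache.hadoop.mapred.ShuffleHandler</value>
</property>
EOF
# On each datanode, also verify the jar shipping the class is installed, e.g.:
#   ls /usr/lib/hadoop-mapreduce/hadoop-mapreduce-client-shuffle*.jar
grep -c '<name>' /tmp/yarn-site-shuffle.xml
```

If the jars are absent, installing the hadoop-mapreduce package on the datanodes (so the NodeManager can see them) is the usual fix.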
Second error on datanodes:
23/08/31 06:18:08 ERROR nodemanager.NodeManager: Error starting NodeManager
java.lang.RuntimeException: java.lang.RuntimeException: java.lang.ClassNotFoundException: Class org.apache.hadoop.mapred.ShuffleHandler not found
(stack trace identical to the first error above)
Hi, I got the following error while trying to install on a CentOS 7 machine:
TASK [common : install packages] ***********************************************
failed: [10.206.46.222] (item=[u'java-1.8.0', u'bigtop-utils']) => {"changed": false, "failed": true, "item": ["java-1.8.0", "bigtop-utils"], "msg": "No Package matching 'bigtop-utils' found available, installed or updated", "rc": 0, "results": []}
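"No Package matching 'bigtop-utils' found" means none of the configured yum repositories publishes a bigtop-utils package. A hedged sketch of adding one — the baseurl below is an assumption and must point at a Bigtop release/OS path that actually exists for your system; the legacy CDH 5 repositories have since been moved behind Cloudera authentication:

```shell
# Drop a repo file that publishes bigtop-utils; written to /tmp here only to
# illustrate the format (real location: /etc/yum.repos.d/bigtop.repo).
cat > /tmp/bigtop.repo <<'EOF'
[bigtop]
name=Apache Bigtop
baseurl=https://archive.apache.org/dist/bigtop/bigtop-1.5.0/repos/centos-7/
gpgcheck=0
enabled=1
EOF
# Then: sudo yum clean all && yum info bigtop-utils
grep -c '=' /tmp/bigtop.repo
```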
Hi,
I am running into an issue with hdfs dfs while installing Hive on a node. Any help is appreciated.
Here is my hosts file info.
[namenodes]
master-01 ansible_user=vagrant
master-02 ansible_user=vagrant
[datanodes]
data-01 ansible_user=vagrant
data-02 ansible_user=vagrant
data-03 ansible_user=vagrant
[yarnresourcemanager]
master-01 ansible_user=vagrant
master-02 ansible_user=vagrant
[zookeepernodes]
master-01 ansible_user=vagrant
master-02 ansible_user=vagrant
utility-01 ansible_user=vagrant
[journalnodes]
master-01 ansible_user=vagrant
master-02 ansible_user=vagrant
utility-01 ansible_user=vagrant
[postgresql]
utility-01 ansible_user=vagrant
[hivemetastore]
utility-01 ansible_user=vagrant
[impala-store-catalog]
utility-01 ansible_user=vagrant
[hbasemaster]
[solr]
#optional
[spark]
[oozie]
utility-01 ansible_user=vagrant
[kafka]
[hue]
edge-01 ansible_user=vagrant
#[dashboard]
[dashboard:children]
namenodes
[hadoop:children]
namenodes
datanodes
journalnodes
yarnresourcemanager
hivemetastore
impala-store-catalog
hbasemaster
solr
spark
oozie
hue
[java:children]
hadoop
kafka
zookeepernodes
TASK [hivemetastore : copy hive-site.xml to hdfs] *****************************************************************************************
changed: [utility-01] => (item=-mkdir -p /etc/hive/conf)
failed: [utility-01] (item=-copyFromLocal -f /etc/cluster/hive/hive-site.xml /etc/hive/conf) => {"changed": true, "cmd": ["sudo", "-u", "hdfs", "hdfs", "dfs", "-copyFromLocal", "-f", "/etc/cluster/hive/hive-site.xml", "/etc/hive/conf"], "delta": "0:00:04.134512", "end": "2019-03-09 10:53:27.748157", "item": "-copyFromLocal -f /etc/cluster/hive/hive-site.xml /etc/hive/conf", "msg": "non-zero return code", "rc": 1, "start": "2019-03-09 10:53:23.613645", "stderr": "19/03/09 10:53:27 INFO hdfs.DFSClient: Exception in createBlockOutputStream\njava.net.ConnectException: Connection refused\n\tat sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)\n\tat sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)\n\tat org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)\n\tat org.apache.hadoop.net.NetUtils.connect(NetUtils.java:530)\n\tat org.apache.hadoop.hdfs.DFSOutputStream.createSocketForPipeline(DFSOutputStream.java:1923)\n\tat org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.createBlockOutputStream(DFSOutputStream.java:1666)\n\tat org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1619)\n\tat org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:771)\n19/03/09 10:53:27 WARN hdfs.DFSClient: Abandoning BP-519781906-192.168.56.100-1552128524558:blk_1073741825_1001\n19/03/09 10:53:27 WARN hdfs.DFSClient: Excluding datanode DatanodeInfoWithStorage[192.168.56.1:50010,DS-24e7211c-66e2-4cf6-b056-a274d6cca4c8,DISK]\n19/03/09 10:53:27 WARN hdfs.DFSClient: DataStreamer Exception\norg.apache.hadoop.ipc.RemoteException(java.io.IOException): File /etc/hive/conf/hive-site.xml.COPYING could only be replicated to 0 nodes instead of minReplication (=1). 
There are 1 datanode(s) running and 1 node(s) are excluded in this operation.\n\tat org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget4NewBlock(BlockManager.java:1626)\n\tat org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:3351)\n\tat org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:683)\n\tat org.apache.hadoop.hdfs.server.namenode.AuthorizationProviderProxyClientProtocol.addBlock(AuthorizationProviderProxyClientProtocol.java:214)\n\tat org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:495)\n\tat org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)\n\tat org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:617)\n\tat org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1073)\n\tat org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2141)\n\tat org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2137)\n\tat java.security.AccessController.doPrivileged(Native Method)\n\tat javax.security.auth.Subject.doAs(Subject.java:422)\n\tat org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1912)\n\tat org.apache.hadoop.ipc.Server$Handler.run(Server.java:2135)\n\n\tat org.apache.hadoop.ipc.Client.call(Client.java:1502)\n\tat org.apache.hadoop.ipc.Client.call(Client.java:1439)\n\tat org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:230)\n\tat com.sun.proxy.$Proxy9.addBlock(Unknown Source)\n\tat org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.addBlock(ClientNamenodeProtocolTranslatorPB.java:413)\n\tat sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)\n\tat sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)\n\tat 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)\n\tat java.lang.reflect.Method.invoke(Method.java:498)\n\tat org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:256)\n\tat org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:104)\n\tat com.sun.proxy.$Proxy10.addBlock(Unknown Source)\n\tat org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.locateFollowingBlock(DFSOutputStream.java:1811)\n\tat org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1607)\n\tat org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:771)\ncopyFromLocal: File /etc/hive/conf/hive-site.xml.COPYING could only be replicated to 0 nodes instead of minReplication (=1). There are 1 datanode(s) running and 1 node(s) are excluded in this operation.", "stderr_lines": ["19/03/09 10:53:27 INFO hdfs.DFSClient: Exception in createBlockOutputStream", "java.net.ConnectException: Connection refused", "\tat sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)", "\tat sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)", "\tat org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)", "\tat org.apache.hadoop.net.NetUtils.connect(NetUtils.java:530)", "\tat org.apache.hadoop.hdfs.DFSOutputStream.createSocketForPipeline(DFSOutputStream.java:1923)", "\tat org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.createBlockOutputStream(DFSOutputStream.java:1666)", "\tat org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1619)", "\tat org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:771)", "19/03/09 10:53:27 WARN hdfs.DFSClient: Abandoning BP-519781906-192.168.56.100-1552128524558:blk_1073741825_1001", "19/03/09 10:53:27 WARN hdfs.DFSClient: Excluding datanode 
DatanodeInfoWithStorage[192.168.56.1:50010,DS-24e7211c-66e2-4cf6-b056-a274d6cca4c8,DISK]", "19/03/09 10:53:27 WARN hdfs.DFSClient: DataStreamer Exception", "org.apache.hadoop.ipc.RemoteException(java.io.IOException): File /etc/hive/conf/hive-site.xml.COPYING could only be replicated to 0 nodes instead of minReplication (=1). There are 1 datanode(s) running and 1 node(s) are excluded in this operation.", "\tat org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget4NewBlock(BlockManager.java:1626)", "\tat org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:3351)", "\tat org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:683)", "\tat org.apache.hadoop.hdfs.server.namenode.AuthorizationProviderProxyClientProtocol.addBlock(AuthorizationProviderProxyClientProtocol.java:214)", "\tat org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:495)", "\tat org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)", "\tat org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:617)", "\tat org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1073)", "\tat org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2141)", "\tat org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2137)", "\tat java.security.AccessController.doPrivileged(Native Method)", "\tat javax.security.auth.Subject.doAs(Subject.java:422)", "\tat org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1912)", "\tat org.apache.hadoop.ipc.Server$Handler.run(Server.java:2135)", "", "\tat org.apache.hadoop.ipc.Client.call(Client.java:1502)", "\tat org.apache.hadoop.ipc.Client.call(Client.java:1439)", "\tat org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:230)", "\tat 
com.sun.proxy.$Proxy9.addBlock(Unknown Source)", "\tat org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.addBlock(ClientNamenodeProtocolTranslatorPB.java:413)", "\tat sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)", "\tat sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)", "\tat sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)", "\tat java.lang.reflect.Method.invoke(Method.java:498)", "\tat org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:256)", "\tat org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:104)", "\tat com.sun.proxy.$Proxy10.addBlock(Unknown Source)", "\tat org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.locateFollowingBlock(DFSOutputStream.java:1811)", "\tat org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1607)", "\tat org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:771)", "copyFromLocal: File /etc/hive/conf/hive-site.xml.COPYING could only be replicated to 0 nodes instead of minReplication (=1). There are 1 datanode(s) running and 1 node(s) are excluded in this operation."], "stdout": "", "stdout_lines": []}
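"could only be replicated to 0 nodes" together with the Connection refused to 192.168.56.1:50010 suggests the datanode registered on the wrong interface: in a Vagrant host-only network, 192.168.56.1 is normally the host machine's address, not a guest's. Some hedged checks (standard HDFS commands, to be run on the cluster):

```shell
# Report which addresses the datanodes registered with:
#   sudo -u hdfs hdfs dfsadmin -report
# If clients must reach datanodes by hostname rather than the (wrong) IP,
# dfs.client.use.datanode.hostname=true in hdfs-site.xml is the usual knob;
# making each datanode bind its host-only address (not the NAT/eth0 one)
# also works.
# Minimal illustration of spotting the suspicious address in the trace:
echo 'Excluding datanode 192.168.56.1:50010' | grep -o '192\.168\.56\.1'
```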
FAILED! => {"changed": false, "failed": true, "invocation": {"module_args": {"arguments": "", "enabled": true, "name": "zookeeper-server", "pattern": null, "runlevel": "default", "sleep": null, "state": "restarted"}, "module_name": "service"}, "msg": "Job for zookeeper-server.service failed because a configured resource limit was exceeded. See "systemctl status zookeeper-server.service" and "journalctl -xe" for details.\n"}
Regarding the Java requirement: how do I download the RPM package from the CLI on RHEL 7, and where should I put it after downloading?
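On RHEL/CentOS 7 there is usually no need to fetch a Java RPM by hand, since java-1.8.0-openjdk is in the stock repositories; a hedged sketch:

```shell
# Install straight from the configured repos:
#   sudo yum install -y java-1.8.0-openjdk
# To only download the RPM (it lands in the current directory, from where yum
# can later install it), yum-utils provides yumdownloader:
cmd='yumdownloader java-1.8.0-openjdk'
echo "$cmd"
```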
Which limit is this talking about?
My VMs for this have 8 GB RAM and 500 GB disk.
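The "configured resource limit" in that message refers to limits on the systemd unit itself (open files, tasks, start-rate limiting), not to the host's RAM or disk, so 8 GB / 500 GB is not the constraint. A hedged way to see which limit tripped (standard systemd commands; the unit name comes from the error message):

```shell
# On an affected node:
#   systemctl show zookeeper-server -p LimitNOFILE -p LimitNPROC -p TasksMax
#   journalctl -u zookeeper-server -n 20 --no-pager
# systemctl show prints key=value pairs; a minimal illustration of reading one:
printf 'LimitNOFILE=4096\nTasksMax=512\n' | awk -F= '$1=="LimitNOFILE"{print $2}'
```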
Thanks for this nicely written playbook.
I am trying to provision this on a single machine; my hosts file is attached.
I have added this repo: https://archive.cloudera.com/cdh5/redhat/6/x86_64/cdh/cloudera-cdh5.repo
and installed Java.
I provisioned like this:
ansible-playbook -i hosts site.yaml --connection=local --become
The error I get is quite long; this is the interesting part:
TASK [hadoop : init secondary instance] ****************************************
fatal: [localhost]: FAILED! =>
INFO namenode.NameNode: registered UNIX signal handlers for [TERM, HUP, INT]
17/04/28 18:37:43 INFO namenode.NameNode: createNameNode [-bootstrapStandby]
17/04/28 18:37:43 ERROR namenode.NameNode: Failed to start namenode.
java.io.IOException: org.apache.hadoop.HadoopIllegalArgumentException: HA is not enabled for this namenode.
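"namenode -bootstrapStandby" is only meaningful when HDFS HA is configured, which requires two namenode hosts; on a single machine with one namenode, this init step should not run, so check whether the [namenodes] group lists the same host twice. For reference, a minimal sketch of the two core hdfs-site.xml HA properties the bootstrap expects — the nameservice and namenode ids are illustrative placeholders:

```shell
# Scratch copy of the HA properties (the real file is normally
# /etc/hadoop/conf/hdfs-site.xml):
cat > /tmp/hdfs-ha-sketch.xml <<'EOF'
<property><name>dfs.nameservices</name><value>mycluster</value></property>
<property><name>dfs.ha.namenodes.mycluster</name><value>nn1,nn2</value></property>
EOF
grep -c '<property>' /tmp/hdfs-ha-sketch.xml
```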
After completing the Ansible installation, I cannot log in to Hue. I checked the corresponding database table and found it empty. How do I log in to Hue or create a login?
Also, how do I access the dashboard?
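Hue is a Django application: by default the first credentials entered on the login page create the superuser account, which is why the auth table is empty until someone has logged in. The account can also be created from the CLI; the path below is the usual CDH packaging location and is an assumption:

```shell
# Create the first (admin) account interactively; shown as a string here
# since the hue binary only exists on the Hue node:
cmd='sudo /usr/lib/hue/build/env/bin/hue createsuperuser'
echo "$cmd"
```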