saltstack-formulas / hadoop-formula
Home Page: http://docs.saltstack.com/en/latest/topics/development/conventions/formulas.html
License: Other
I may be misunderstanding SaltStack usage. Here is my case: I don't want to specify roles via grains; instead I would like to use nodegroups, pillars and/or states to define which nodes get which Hadoop services/components installed.
Quick example, I would like to be able to have a top.sls which contains:

```yaml
base:
  'datanode-*':
    - hadoop.hdfs.slave  # or use hadoop.hdfs.datanode
  'namenode-*':
    - hadoop.hdfs.master # or use hadoop.hdfs.namenode
```

Hope you get the idea.
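For reference, nodegroup-based targeting along these lines might look like the following sketch. The nodegroup names and match expressions here are assumptions for illustration, not part of the formula:

```yaml
# /etc/salt/master.d/nodegroups.conf (hypothetical nodegroup names)
nodegroups:
  hdfs_datanodes: 'datanode-*'
  hdfs_namenodes: 'namenode-*'

# top.sls referencing those nodegroups instead of grains
base:
  hdfs_datanodes:
    - match: nodegroup
    - hadoop.hdfs.datanode
  hdfs_namenodes:
    - match: nodegroup
    - hadoop.hdfs.namenode
```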
Hi,
Just curious where you got the HDP tarball URLs from. I am trying to update the formula as I would like to use the latest HDP 2.3.4, which features Spark 1.5.2.
All I can find are RPM refs...
Thanks,
Ivan
2015-02-14 03:08:03,475 INFO org.apache.hadoop.service.AbstractService: Service ResourceManager failed in state INITED; cause: java.lang.IllegalStateException: Queue configuration missing child queue names for root
java.lang.IllegalStateException: Queue configuration missing child queue names for root
at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.parseQueue(CapacityScheduler.java:512)
at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.initializeQueues(CapacityScheduler.java:436)
at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.reinitialize(CapacityScheduler.java:301)
at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$RMActiveServices.serviceInit(ResourceManager.java:436)
at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.createAndInitActiveServices(ResourceManager.java:825)
at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.serviceInit(ResourceManager.java:227)
at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.main(ResourceManager.java:1040)
2015-02-14 03:08:03,475 INFO org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Transitioning to standby state
2015-02-14 03:08:03,475 INFO org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Transitioned to standby state
2015-02-14 03:08:03,475 FATAL org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Error starting ResourceManager
java.lang.IllegalStateException: Queue configuration missing child queue names for root
at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.parseQueue(CapacityScheduler.java:512)
at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.initializeQueues(CapacityScheduler.java:436)
at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.reinitialize(CapacityScheduler.java:301)
at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$RMActiveServices.serviceInit(ResourceManager.java:436)
at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.createAndInitActiveServices(ResourceManager.java:825)
at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.serviceInit(ResourceManager.java:227)
at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.main(ResourceManager.java:1040)
2015-02-14 03:08:03,477 INFO org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: SHUTDOWN_MSG:
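The "Queue configuration missing child queue names for root" error means capacity-scheduler.xml defines no child queues under the root queue. A minimal fix, sketched here as a pillar fragment in this formula's `config` convention (the pillar key path `yarn:config:capacity-scheduler` is an assumption; the two property names are standard Hadoop CapacityScheduler settings):

```yaml
yarn:
  config:
    capacity-scheduler:
      yarn.scheduler.capacity.root.queues:
        value: default
      yarn.scheduler.capacity.root.default.capacity:
        value: 100
```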
stderr:
/usr/lib/hadoop/bin/hdfs: line 46: .: /usr/lib/hadoop: is a directory
stdout:
DBG: bin is "/usr/lib/hadoop/bin"
DBG: before profile.d kicks in: DEFAULT_LIBEXEC_DIR is /usr/lib/hadoop/bin/../libexec
DBG: also HADOOP_HOME is "/usr/lib/hadoop
"
HADOOP_HOME is resolved correctly when the hdfs command is executed in a regular shell, but not when it is run by cmd.run (I have no clue why).
The HADOOP_HOME variable is only defined in /etc/profile.d/hadoop.sh, which is a Jinja template that the hadoop state puts in place. For now I will remove the deprecated HADOOP_HOME value from the profile.d script and see what that does.
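Since cmd.run does not source /etc/profile.d, one workaround is to pass the variable explicitly via the state's `env` argument. A sketch, using the paths from the output above (the state ID is hypothetical):

```yaml
run-hdfs-command:
  cmd.run:
    - name: /usr/lib/hadoop/bin/hdfs namenode -format
    - env:
      - HADOOP_HOME: /usr/lib/hadoop
```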
# salt --versions-report
Salt: 2015.5.1
Python: 2.6.9 (unknown, Apr 1 2015, 18:16:00)
Jinja2: 2.7.2
M2Crypto: 0.21.1
msgpack-python: 0.4.6
msgpack-pure: Not Installed
pycrypto: 2.6.1
libnacl: Not Installed
PyYAML: 3.10
ioflo: Not Installed
PyZMQ: 14.3.1
RAET: Not Installed
ZMQ: 3.2.5
Mako: Not Installed
format-namenode executes with a newline in the classpath, which stemmed from the HADOOP_CONF_DIR environment variable. This is mentioned in saltstack/salt#24191 and fixed/merged in saltstack/salt#24454. You can see this in the STARTUP_MSG: classpath below. Also notice that the script is using defaults such as /tmp/hadoop-hdfs as the data directory.
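A quick way to see how a stray trailing newline in an environment variable corrupts a colon-joined classpath (the variable values here are illustrative, not taken from the formula):

```shell
# Simulate a HADOOP_CONF_DIR value that picked up a trailing newline
conf_dir='/etc/hadoop/conf
'
classpath="${conf_dir}:/usr/lib/hadoop/common.jar"

# A clean value contains zero embedded newlines; the corrupted one contains one,
# splitting the classpath across two lines exactly as in the STARTUP_MSG below.
printf '%s' "$classpath" | wc -l
```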
ID: format-namenode
Function: cmd.run
Name: /usr/lib/hadoop/bin/hdfs namenode -format
Result: True
Comment: Command "/usr/lib/hadoop/bin/hdfs namenode -format" run
Started: 14:04:23.583055
Duration: 6719.372 ms
Changes:
----------
pid:
1832
retcode:
0
stderr:
stdout:
2015-06-26 14:04:23,985 INFO [main] namenode.NameNode (StringUtils.java:startupShutdownMessage(619)) - STARTUP_MSG:
/************************************************************
STARTUP_MSG: Starting NameNode
STARTUP_MSG: host = hdfs-master1.dev.coinsmith.co/192.168.1.175
STARTUP_MSG: args = [-format]
STARTUP_MSG: version = 2.5.2
STARTUP_MSG: classpath = /etc/hadoop/conf
:/usr/lib/hadoop/share/hadoop/common/lib/jasper-compiler-5.5.23.jar:/usr/lib/hadoop/share/hadoop/common/lib/commons-codec-1.4.jar:/usr/lib/hadoop/share/hadoop/common/lib/xmlenc-0.52.jar:/usr/lib/hadoop/share/hadoop/common/lib/hamcrest-core-1.3.jar:/usr/lib/hadoop/share/hadoop/common/lib/stax-api-1.0-2.jar:/usr/lib/hadoop/share/hadoop/common/lib/jackson-jaxrs-1.9.13.jar:/usr/lib/hadoop/share/hadoop/common/lib/jackson-core-asl-1.9.13.jar:/usr/lib/hadoop/share/hadoop/common/lib/apacheds-kerberos-codec-2.0.0-M15.jar:/usr/lib/hadoop/share/hadoop/common/lib/commons-lang-2.6.jar:/usr/lib/hadoop/share/hadoop/common/lib/slf4j-log4j12-1.7.5.jar:/usr/lib/hadoop/share/hadoop/common/lib/jackson-xc-1.9.13.jar:/usr/lib/hadoop/share/hadoop/common/lib/jsr305-1.3.9.jar:/usr/lib/hadoop/share/hadoop/common/lib/api-asn1-api-1.0.0-M20.jar:/usr/lib/hadoop/share/hadoop/common/lib/jettison-1.1.jar:/usr/lib/hadoop/share/hadoop/common/lib/jersey-json-1.9.jar:/usr/lib/hadoop/share/hadoop/common/lib/commons-math3-3.1.1.jar:/usr/lib/hadoop/share/hadoop/common/lib/api-util-1.0.0-M20.jar:/usr/lib/hadoop/share/hadoop/common/lib/paranamer-2.3.jar:/usr/lib/hadoop/share/hadoop/common/lib/commons-collections-3.2.1.jar:/usr/lib/hadoop/share/hadoop/common/lib/commons-cli-1.2.jar:/usr/lib/hadoop/share/hadoop/common/lib/guava-11.0.2.jar:/usr/lib/hadoop/share/hadoop/common/lib/avro-1.7.4.jar:/usr/lib/hadoop/share/hadoop/common/lib/jsch-0.1.42.jar:/usr/lib/hadoop/share/hadoop/common/lib/jetty-util-6.1.26.jar:/usr/lib/hadoop/share/hadoop/common/lib/zookeeper-3.4.6.jar:/usr/lib/hadoop/share/hadoop/common/lib/commons-configuration-1.6.jar:/usr/lib/hadoop/share/hadoop/common/lib/activation-1.1.jar:/usr/lib/hadoop/share/hadoop/common/lib/snappy-java-1.0.4.1.jar:/usr/lib/hadoop/share/hadoop/common/lib/commons-beanutils-core-1.8.0.jar:/usr/lib/hadoop/share/hadoop/common/lib/jetty-6.1.26.jar:/usr/lib/hadoop/share/hadoop/common/lib/jets3t-0.9.0.jar:/usr/lib/hadoop/share/hadoop/common/lib/commons-compress-1.4.1.jar:/u
sr/lib/hadoop/share/hadoop/common/lib/jackson-mapper-asl-1.9.13.jar:/usr/lib/hadoop/share/hadoop/common/lib/junit-4.11.jar:/usr/lib/hadoop/share/hadoop/common/lib/commons-el-1.0.jar:/usr/lib/hadoop/share/hadoop/common/lib/netty-3.6.2.Final.jar:/usr/lib/hadoop/share/hadoop/common/lib/servlet-api-2.5.jar:/usr/lib/hadoop/share/hadoop/common/lib/httpcore-4.2.5.jar:/usr/lib/hadoop/share/hadoop/common/lib/httpclient-4.2.5.jar:/usr/lib/hadoop/share/hadoop/common/lib/commons-logging-1.1.3.jar:/usr/lib/hadoop/share/hadoop/common/lib/jasper-runtime-5.5.23.jar:/usr/lib/hadoop/share/hadoop/common/lib/hadoop-annotations-2.5.2.jar:/usr/lib/hadoop/share/hadoop/common/lib/protobuf-java-2.5.0.jar:/usr/lib/hadoop/share/hadoop/common/lib/hadoop-auth-2.5.2.jar:/usr/lib/hadoop/share/hadoop/common/lib/slf4j-api-1.7.5.jar:/usr/lib/hadoop/share/hadoop/common/lib/java-xmlbuilder-0.4.jar:/usr/lib/hadoop/share/hadoop/common/lib/jersey-core-1.9.jar:/usr/lib/hadoop/share/hadoop/common/lib/commons-beanutils-1.7.0.jar:/usr/lib/hadoop/share/hadoop/common/lib/jsp-api-2.1.jar:/usr/lib/hadoop/share/hadoop/common/lib/jersey-server-1.9.jar:/usr/lib/hadoop/share/hadoop/common/lib/mockito-all-1.8.5.jar:/usr/lib/hadoop/share/hadoop/common/lib/jaxb-api-2.2.2.jar:/usr/lib/hadoop/share/hadoop/common/lib/asm-3.2.jar:/usr/lib/hadoop/share/hadoop/common/lib/jaxb-impl-2.2.3-1.jar:/usr/lib/hadoop/share/hadoop/common/lib/commons-net-3.1.jar:/usr/lib/hadoop/share/hadoop/common/lib/xz-1.0.jar:/usr/lib/hadoop/share/hadoop/common/lib/apacheds-i18n-2.0.0-M15.jar:/usr/lib/hadoop/share/hadoop/common/lib/commons-httpclient-3.1.jar:/usr/lib/hadoop/share/hadoop/common/lib/commons-io-2.4.jar:/usr/lib/hadoop/share/hadoop/common/lib/log4j-1.2.17.jar:/usr/lib/hadoop/share/hadoop/common/lib/commons-digester-1.8.jar:/usr/lib/hadoop/share/hadoop/common/hadoop-nfs-2.5.2.jar:/usr/lib/hadoop/share/hadoop/common/hadoop-common-2.5.2.jar:/usr/lib/hadoop/share/hadoop/common/hadoop-common-2.5.2-tests.jar:/usr/lib/hadoop/share/hadoop/hdfs:
/usr/lib/hadoop/share/hadoop/hdfs/lib/commons-daemon-1.0.13.jar:/usr/lib/hadoop/share/hadoop/hdfs/lib/commons-codec-1.4.jar:/usr/lib/hadoop/share/hadoop/hdfs/lib/xmlenc-0.52.jar:/usr/lib/hadoop/share/hadoop/hdfs/lib/jackson-core-asl-1.9.13.jar:/usr/lib/hadoop/share/hadoop/hdfs/lib/commons-lang-2.6.jar:/usr/lib/hadoop/share/hadoop/hdfs/lib/jsr305-1.3.9.jar:/usr/lib/hadoop/share/hadoop/hdfs/lib/commons-cli-1.2.jar:/usr/lib/hadoop/share/hadoop/hdfs/lib/guava-11.0.2.jar:/usr/lib/hadoop/share/hadoop/hdfs/lib/jetty-util-6.1.26.jar:/usr/lib/hadoop/share/hadoop/hdfs/lib/jetty-6.1.26.jar:/usr/lib/hadoop/share/hadoop/hdfs/lib/jackson-mapper-asl-1.9.13.jar:/usr/lib/hadoop/share/hadoop/hdfs/lib/commons-el-1.0.jar:/usr/lib/hadoop/share/hadoop/hdfs/lib/netty-3.6.2.Final.jar:/usr/lib/hadoop/share/hadoop/hdfs/lib/servlet-api-2.5.jar:/usr/lib/hadoop/share/hadoop/hdfs/lib/commons-logging-1.1.3.jar:/usr/lib/hadoop/share/hadoop/hdfs/lib/jasper-runtime-5.5.23.jar:/usr/lib/hadoop/share/hadoop/hdfs/lib/protobuf-java-2.5.0.jar:/usr/lib/hadoop/share/hadoop/hdfs/lib/jersey-core-1.9.jar:/usr/lib/hadoop/share/hadoop/hdfs/lib/jsp-api-2.1.jar:/usr/lib/hadoop/share/hadoop/hdfs/lib/jersey-server-1.9.jar:/usr/lib/hadoop/share/hadoop/hdfs/lib/asm-3.2.jar:/usr/lib/hadoop/share/hadoop/hdfs/lib/commons-io-2.4.jar:/usr/lib/hadoop/share/hadoop/hdfs/lib/log4j-1.2.17.jar:/usr/lib/hadoop/share/hadoop/hdfs/hadoop-hdfs-2.5.2-tests.jar:/usr/lib/hadoop/share/hadoop/hdfs/hadoop-hdfs-2.5.2.jar:/usr/lib/hadoop/share/hadoop/hdfs/hadoop-hdfs-nfs-2.5.2.jar:/usr/lib/hadoop/share/hadoop/yarn/lib/jersey-client-1.9.jar:/usr/lib/hadoop/share/hadoop/yarn/lib/commons-codec-1.4.jar:/usr/lib/hadoop/share/hadoop/yarn/lib/leveldbjni-all-1.8.jar:/usr/lib/hadoop/share/hadoop/yarn/lib/stax-api-1.0-2.jar:/usr/lib/hadoop/share/hadoop/yarn/lib/jackson-jaxrs-1.9.13.jar:/usr/lib/hadoop/share/hadoop/yarn/lib/jackson-core-asl-1.9.13.jar:/usr/lib/hadoop/share/hadoop/yarn/lib/commons-lang-2.6.jar:/usr/lib/hadoop/share/hadoop/yarn/lib/jacks
on-xc-1.9.13.jar:/usr/lib/hadoop/share/hadoop/yarn/lib/jsr305-1.3.9.jar:/usr/lib/hadoop/share/hadoop/yarn/lib/jettison-1.1.jar:/usr/lib/hadoop/share/hadoop/yarn/lib/jersey-json-1.9.jar:/usr/lib/hadoop/share/hadoop/yarn/lib/jersey-guice-1.9.jar:/usr/lib/hadoop/share/hadoop/yarn/lib/commons-collections-3.2.1.jar:/usr/lib/hadoop/share/hadoop/yarn/lib/commons-cli-1.2.jar:/usr/lib/hadoop/share/hadoop/yarn/lib/guava-11.0.2.jar:/usr/lib/hadoop/share/hadoop/yarn/lib/guice-3.0.jar:/usr/lib/hadoop/share/hadoop/yarn/lib/jetty-util-6.1.26.jar:/usr/lib/hadoop/share/hadoop/yarn/lib/zookeeper-3.4.6.jar:/usr/lib/hadoop/share/hadoop/yarn/lib/activation-1.1.jar:/usr/lib/hadoop/share/hadoop/yarn/lib/jetty-6.1.26.jar:/usr/lib/hadoop/share/hadoop/yarn/lib/commons-compress-1.4.1.jar:/usr/lib/hadoop/share/hadoop/yarn/lib/aopalliance-1.0.jar:/usr/lib/hadoop/share/hadoop/yarn/lib/jackson-mapper-asl-1.9.13.jar:/usr/lib/hadoop/share/hadoop/yarn/lib/netty-3.6.2.Final.jar:/usr/lib/hadoop/share/hadoop/yarn/lib/servlet-api-2.5.jar:/usr/lib/hadoop/share/hadoop/yarn/lib/commons-logging-1.1.3.jar:/usr/lib/hadoop/share/hadoop/yarn/lib/protobuf-java-2.5.0.jar:/usr/lib/hadoop/share/hadoop/yarn/lib/javax.inject-1.jar:/usr/lib/hadoop/share/hadoop/yarn/lib/jline-0.9.94.jar:/usr/lib/hadoop/share/hadoop/yarn/lib/jersey-core-1.9.jar:/usr/lib/hadoop/share/hadoop/yarn/lib/jersey-server-1.9.jar:/usr/lib/hadoop/share/hadoop/yarn/lib/jaxb-api-2.2.2.jar:/usr/lib/hadoop/share/hadoop/yarn/lib/asm-3.2.jar:/usr/lib/hadoop/share/hadoop/yarn/lib/jaxb-impl-2.2.3-1.jar:/usr/lib/hadoop/share/hadoop/yarn/lib/xz-1.0.jar:/usr/lib/hadoop/share/hadoop/yarn/lib/guice-servlet-3.0.jar:/usr/lib/hadoop/share/hadoop/yarn/lib/commons-httpclient-3.1.jar:/usr/lib/hadoop/share/hadoop/yarn/lib/commons-io-2.4.jar:/usr/lib/hadoop/share/hadoop/yarn/lib/log4j-1.2.17.jar:/usr/lib/hadoop/share/hadoop/yarn/hadoop-yarn-applications-unmanaged-am-launcher-2.5.2.jar:/usr/lib/hadoop/share/hadoop/yarn/hadoop-yarn-common-2.5.2.jar:/usr/lib/hadoop/share
/hadoop/yarn/hadoop-yarn-api-2.5.2.jar:/usr/lib/hadoop/share/hadoop/yarn/hadoop-yarn-server-common-2.5.2.jar:/usr/lib/hadoop/share/hadoop/yarn/hadoop-yarn-server-nodemanager-2.5.2.jar:/usr/lib/hadoop/share/hadoop/yarn/hadoop-yarn-applications-distributedshell-2.5.2.jar:/usr/lib/hadoop/share/hadoop/yarn/hadoop-yarn-client-2.5.2.jar:/usr/lib/hadoop/share/hadoop/yarn/hadoop-yarn-server-resourcemanager-2.5.2.jar:/usr/lib/hadoop/share/hadoop/yarn/hadoop-yarn-server-applicationhistoryservice-2.5.2.jar:/usr/lib/hadoop/share/hadoop/yarn/hadoop-yarn-server-tests-2.5.2.jar:/usr/lib/hadoop/share/hadoop/yarn/hadoop-yarn-server-web-proxy-2.5.2.jar:/usr/lib/hadoop/share/hadoop/mapreduce/lib/hamcrest-core-1.3.jar:/usr/lib/hadoop/share/hadoop/mapreduce/lib/leveldbjni-all-1.8.jar:/usr/lib/hadoop/share/hadoop/mapreduce/lib/jackson-core-asl-1.9.13.jar:/usr/lib/hadoop/share/hadoop/mapreduce/lib/jersey-guice-1.9.jar:/usr/lib/hadoop/share/hadoop/mapreduce/lib/paranamer-2.3.jar:/usr/lib/hadoop/share/hadoop/mapreduce/lib/avro-1.7.4.jar:/usr/lib/hadoop/share/hadoop/mapreduce/lib/guice-3.0.jar:/usr/lib/hadoop/share/hadoop/mapreduce/lib/snappy-java-1.0.4.1.jar:/usr/lib/hadoop/share/hadoop/mapreduce/lib/commons-compress-1.4.1.jar:/usr/lib/hadoop/share/hadoop/mapreduce/lib/aopalliance-1.0.jar:/usr/lib/hadoop/share/hadoop/mapreduce/lib/jackson-mapper-asl-1.9.13.jar:/usr/lib/hadoop/share/hadoop/mapreduce/lib/junit-4.11.jar:/usr/lib/hadoop/share/hadoop/mapreduce/lib/netty-3.6.2.Final.jar:/usr/lib/hadoop/share/hadoop/mapreduce/lib/hadoop-annotations-2.5.2.jar:/usr/lib/hadoop/share/hadoop/mapreduce/lib/protobuf-java-2.5.0.jar:/usr/lib/hadoop/share/hadoop/mapreduce/lib/javax.inject-1.jar:/usr/lib/hadoop/share/hadoop/mapreduce/lib/jersey-core-1.9.jar:/usr/lib/hadoop/share/hadoop/mapreduce/lib/jersey-server-1.9.jar:/usr/lib/hadoop/share/hadoop/mapreduce/lib/asm-3.2.jar:/usr/lib/hadoop/share/hadoop/mapreduce/lib/xz-1.0.jar:/usr/lib/hadoop/share/hadoop/mapreduce/lib/guice-servlet-3.0.jar:/usr/lib/hadoop/
share/hadoop/mapreduce/lib/commons-io-2.4.jar:/usr/lib/hadoop/share/hadoop/mapreduce/lib/log4j-1.2.17.jar:/usr/lib/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-client-app-2.5.2.jar:/usr/lib/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-client-shuffle-2.5.2.jar:/usr/lib/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-client-core-2.5.2.jar:/usr/lib/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-client-hs-plugins-2.5.2.jar:/usr/lib/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-client-common-2.5.2.jar:/usr/lib/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-client-hs-2.5.2.jar:/usr/lib/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.5.2.jar:/usr/lib/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-client-jobclient-2.5.2-tests.jar:/usr/lib/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-client-jobclient-2.5.2.jar
STARTUP_MSG: build = https://git-wip-us.apache.org/repos/asf/hadoop.git -r cc72e9b000545b86b75a61f4835eb86d57bfafc0; compiled by 'jenkins' on 2014-11-14T23:45Z
STARTUP_MSG: java = 1.8.0_40
************************************************************/
2015-06-26 14:04:23,991 INFO [main] namenode.NameNode (SignalLogger.java:register(91)) - registered UNIX signal handlers for [TERM, HUP, INT]
2015-06-26 14:04:23,993 INFO [main] namenode.NameNode (NameNode.java:createNameNode(1342)) - createNameNode [-format]
Formatting using clusterid: CID-c8e542a8-56cc-4692-af51-17a16fb8e0f3
2015-06-26 14:04:29,925 INFO [main] namenode.FSNamesystem (FSNamesystem.java:<init>(739)) - fsLock is fair:true
2015-06-26 14:04:29,952 INFO [main] blockmanagement.DatanodeManager (DatanodeManager.java:<init>(229)) - dfs.block.invalidate.limit=1000
2015-06-26 14:04:29,952 INFO [main] blockmanagement.DatanodeManager (DatanodeManager.java:<init>(235)) - dfs.namenode.datanode.registration.ip-hostname-check=true
2015-06-26 14:04:29,953 INFO [main] blockmanagement.BlockManager (InvalidateBlocks.java:printBlockDeletionTime(71)) - dfs.namenode.startup.delay.block.deletion.sec is set to 000:00:00:00.000
2015-06-26 14:04:29,955 INFO [main] blockmanagement.BlockManager (InvalidateBlocks.java:printBlockDeletionTime(76)) - The block deletion will start around 2015 Jun 26 14:04:29
2015-06-26 14:04:29,956 INFO [main] util.GSet (LightWeightGSet.java:computeCapacity(354)) - Computing capacity for map BlocksMap
2015-06-26 14:04:29,956 INFO [main] util.GSet (LightWeightGSet.java:computeCapacity(355)) - VM type = 64-bit
2015-06-26 14:04:29,957 INFO [main] util.GSet (LightWeightGSet.java:computeCapacity(356)) - 2.0% max memory 889 MB = 17.8 MB
2015-06-26 14:04:29,957 INFO [main] util.GSet (LightWeightGSet.java:computeCapacity(361)) - capacity = 2^21 = 2097152 entries
2015-06-26 14:04:29,974 INFO [main] blockmanagement.BlockManager (BlockManager.java:createBlockTokenSecretManager(354)) - dfs.block.access.token.enable=false
2015-06-26 14:04:29,978 INFO [main] blockmanagement.BlockManager (BlockManager.java:<init>(339)) - defaultReplication = 3
2015-06-26 14:04:29,978 INFO [main] blockmanagement.BlockManager (BlockManager.java:<init>(340)) - maxReplication = 512
2015-06-26 14:04:29,978 INFO [main] blockmanagement.BlockManager (BlockManager.java:<init>(341)) - minReplication = 1
2015-06-26 14:04:29,981 INFO [main] blockmanagement.BlockManager (BlockManager.java:<init>(342)) - maxReplicationStreams = 2
2015-06-26 14:04:29,981 INFO [main] blockmanagement.BlockManager (BlockManager.java:<init>(343)) - shouldCheckForEnoughRacks = false
2015-06-26 14:04:29,981 INFO [main] blockmanagement.BlockManager (BlockManager.java:<init>(344)) - replicationRecheckInterval = 3000
2015-06-26 14:04:29,981 INFO [main] blockmanagement.BlockManager (BlockManager.java:<init>(345)) - encryptDataTransfer = false
2015-06-26 14:04:29,981 INFO [main] blockmanagement.BlockManager (BlockManager.java:<init>(346)) - maxNumBlocksToLog = 1000
2015-06-26 14:04:29,985 INFO [main] namenode.FSNamesystem (FSNamesystem.java:<init>(758)) - fsOwner = hdfs (auth:SIMPLE)
2015-06-26 14:04:29,985 INFO [main] namenode.FSNamesystem (FSNamesystem.java:<init>(759)) - supergroup = supergroup
2015-06-26 14:04:29,985 INFO [main] namenode.FSNamesystem (FSNamesystem.java:<init>(760)) - isPermissionEnabled = true
2015-06-26 14:04:29,985 INFO [main] namenode.FSNamesystem (FSNamesystem.java:<init>(771)) - HA Enabled: false
2015-06-26 14:04:29,987 INFO [main] namenode.FSNamesystem (FSNamesystem.java:<init>(808)) - Append Enabled: true
2015-06-26 14:04:30,088 INFO [main] util.GSet (LightWeightGSet.java:computeCapacity(354)) - Computing capacity for map INodeMap
2015-06-26 14:04:30,088 INFO [main] util.GSet (LightWeightGSet.java:computeCapacity(355)) - VM type = 64-bit
2015-06-26 14:04:30,088 INFO [main] util.GSet (LightWeightGSet.java:computeCapacity(356)) - 1.0% max memory 889 MB = 8.9 MB
2015-06-26 14:04:30,088 INFO [main] util.GSet (LightWeightGSet.java:computeCapacity(361)) - capacity = 2^20 = 1048576 entries
2015-06-26 14:04:30,089 INFO [main] namenode.NameNode (FSDirectory.java:<init>(209)) - Caching file names occuring more than 10 times
2015-06-26 14:04:30,093 INFO [main] util.GSet (LightWeightGSet.java:computeCapacity(354)) - Computing capacity for map cachedBlocks
2015-06-26 14:04:30,093 INFO [main] util.GSet (LightWeightGSet.java:computeCapacity(355)) - VM type = 64-bit
2015-06-26 14:04:30,093 INFO [main] util.GSet (LightWeightGSet.java:computeCapacity(356)) - 0.25% max memory 889 MB = 2.2 MB
2015-06-26 14:04:30,093 INFO [main] util.GSet (LightWeightGSet.java:computeCapacity(361)) - capacity = 2^18 = 262144 entries
2015-06-26 14:04:30,094 INFO [main] namenode.FSNamesystem (FSNamesystem.java:<init>(5095)) - dfs.namenode.safemode.threshold-pct = 0.9990000128746033
2015-06-26 14:04:30,095 INFO [main] namenode.FSNamesystem (FSNamesystem.java:<init>(5096)) - dfs.namenode.safemode.min.datanodes = 0
2015-06-26 14:04:30,095 INFO [main] namenode.FSNamesystem (FSNamesystem.java:<init>(5097)) - dfs.namenode.safemode.extension = 30000
2015-06-26 14:04:30,095 INFO [main] namenode.FSNamesystem (FSNamesystem.java:initRetryCache(892)) - Retry cache on namenode is enabled
2015-06-26 14:04:30,095 INFO [main] namenode.FSNamesystem (FSNamesystem.java:initRetryCache(900)) - Retry cache will use 0.03 of total heap and retry cache entry expiry time is 600000 millis
2015-06-26 14:04:30,106 INFO [main] util.GSet (LightWeightGSet.java:computeCapacity(354)) - Computing capacity for map NameNodeRetryCache
2015-06-26 14:04:30,106 INFO [main] util.GSet (LightWeightGSet.java:computeCapacity(355)) - VM type = 64-bit
2015-06-26 14:04:30,106 INFO [main] util.GSet (LightWeightGSet.java:computeCapacity(356)) - 0.029999999329447746% max memory 889 MB = 273.1 KB
2015-06-26 14:04:30,106 INFO [main] util.GSet (LightWeightGSet.java:computeCapacity(361)) - capacity = 2^15 = 32768 entries
2015-06-26 14:04:30,113 INFO [main] namenode.NNConf (NNConf.java:<init>(62)) - ACLs enabled? false
2015-06-26 14:04:30,113 INFO [main] namenode.NNConf (NNConf.java:<init>(66)) - XAttrs enabled? true
2015-06-26 14:04:30,113 INFO [main] namenode.NNConf (NNConf.java:<init>(74)) - Maximum size of an xattr: 16384
2015-06-26 14:04:30,134 INFO [main] namenode.FSImage (FSImage.java:format(145)) - Allocated new BlockPoolId: BP-1699152797-192.168.1.175-1435345470119
2015-06-26 14:04:30,199 INFO [main] common.Storage (NNStorage.java:format(550)) - Storage directory /tmp/hadoop-hdfs/dfs/name has been successfully formatted.
2015-06-26 14:04:30,294 INFO [main] namenode.NNStorageRetentionManager (NNStorageRetentionManager.java:getImageTxIdToRetain(203)) - Going to retain 1 images with txid >= 0
2015-06-26 14:04:30,295 INFO [main] util.ExitUtil (ExitUtil.java:terminate(124)) - Exiting with status 0
2015-06-26 14:04:30,296 INFO [Thread-1] namenode.NameNode (StringUtils.java:run(645)) - SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at hdfs-master1.dev.coinsmith.co/192.168.1.175
************************************************************/
Using SaltStack's Debian repositories. Here's a version report:
Salt: 2015.5.0
Python: 2.7.9 (default, Mar 1 2015, 12:57:24)
Jinja2: 2.7.3
M2Crypto: 0.21.1
msgpack-python: 0.4.2
msgpack-pure: Not Installed
pycrypto: 2.6.1
libnacl: Not Installed
PyYAML: 3.11
ioflo: Not Installed
PyZMQ: 14.4.0
RAET: Not Installed
ZMQ: 4.0.5
Mako: 1.0.0
Debian source package: 2015.5.0+ds-1~bpo8+1
I'm pretty sure it's my pillar config, but I can't figure it out. Anybody else see this when trying to use hadoop.hdfs?

```yaml
hdfs:
  config:
    hdfs-site:
      dfs.datanode.synconclose:
        value: true
      dfs.durable.sync:
        value: true
      dfs.permission:
        value: false
    namenode_http_port: 50070
    namenode_port: 8020
    secondarynamenode_http_port: 50090
  datanode_target: "roles:hadoop_slave"
  namenode_target: "roles:hadoop_master"
```
Hello @sroegner,
I would like to provide a PR which extends the existing pillar options with a clusters option for deploying several Hadoop clusters.
I would like to hear your review, and whether you like it or not. If you do, I can provide the PR; if you don't, please say why.
Here is an example of the additional pillar parameters:
```yaml
hdfs:
  clusters:
    cluster1:
      primary_namenode: 192.168.0.101
      secondary_namenode: 192.168.0.102
      journalnodes:
        - 192.168.0.101
        - 192.168.0.102
        - 192.168.0.103
      zookeeper_connection_string: 192.168.0.101:2181,192.168.0.102:2182,192.168.0.103:2183
      datanodes:
        - 192.168.0.104
        - 192.168.0.105
        - 192.168.0.106
    cluster2:
      primary_namenode: 192.168.1.101
      secondary_namenode: 192.168.1.102
      journalnodes:
        - 192.168.1.101
        - 192.168.1.102
        - 192.168.1.103
      zookeeper_connection_string: 192.168.1.101:2181,192.168.1.102:2182,192.168.1.103:2183
      datanodes:
        - 192.168.1.104
        - 192.168.1.105
        - 192.168.1.106

mapred:
  clusters:
    cluster1:
      jobtracker: 192.168.0.101
      jobtracker_on_primary_namenode: False
      tasktrackers_on_datanodes: False
      tasktrackers:
        - 192.168.0.104
        - 192.168.0.105
        - 192.168.0.106
    cluster2:
      jobtracker: 192.168.1.101
      jobtracker_on_primary_namenode: True
      tasktrackers_on_datanodes: True
      tasktrackers:
        - 192.168.1.104
        - 192.168.1.105
        - 192.168.1.106

yarn:
  clusters:
    cluster1:
      primary_resourcemanager: 192.168.0.101
      secondary_resourcemanager: 192.168.0.102
      resource_manager_on_namenode: True
      nodemanagers_on_datanodes: True
      zookeeper_connection_string: 192.168.0.101:2181,192.168.0.102:2182,192.168.0.103:2183
      nodemanagers:
        - 192.168.0.104
        - 192.168.0.105
        - 192.168.0.106
    cluster2:
      primary_resourcemanager: 192.168.0.101
      secondary_resourcemanager: 192.168.0.102
      resource_manager_on_namenode: False
      nodemanagers_on_datanodes: False
      zookeeper_connection_string: 192.168.1.101:2181,192.168.1.102:2182,192.168.1.103:2183
      nodemanagers:
        - 192.168.1.104
        - 192.168.1.105
        - 192.168.1.106
```
About mapred:
If `jobtracker_on_primary_namenode` is True, the `jobtracker` parameter isn't used; the job tracker will be installed on the primary namenode.
If `tasktrackers_on_datanodes` is True, task trackers will also be installed on all datanodes (in addition to the `tasktrackers` list).
If there is no `tasktrackers` parameter in the cluster and `tasktrackers_on_datanodes` is set to False, targeting methods will be used.
About yarn:
If `resource_manager_on_namenode` is True, the `primary_resourcemanager` and `secondary_resourcemanager` parameters are ignored. The resource manager will be installed on the primary namenode, or on both primary and secondary namenodes if a secondary namenode is specified.
If `nodemanagers_on_datanodes` is True, node managers will also be installed on all datanodes (in addition to the `nodemanagers` list).
If there is no `nodemanagers` parameter in the cluster and `nodemanagers_on_datanodes` is set to False, targeting methods will be used.
`zookeeper_connection_string` is an optional parameter; if it is empty, the Hadoop formula will fetch the connection string from the Zookeeper formula.
About yarn: if you don't want to complicate things, I can remove the YARN high-availability feature.
Best regards,
Alexandr
I'd like to verify this behavior in a RedHat based distro. I'll check myself if no one comments.
Init scripts usually manage creating a directory in /var/run upon start/restart (source). Can this be done here?
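Since /var/run is typically a tmpfs that is emptied on reboot (which is why init scripts recreate it), a Salt state could do the same. A sketch; the state ID, directory name, and user/group are assumptions:

```yaml
hadoop-run-directory:
  file.directory:
    - name: /var/run/hadoop
    - user: hdfs
    - group: hadoop
    - mode: 755
```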
I'm not experienced enough with CDH/YARN to know what's really supposed to happen from the distro, but the container-executor binary isn't in the ${hadoop}/bin directory, or anywhere else. This causes the state to fail. Anyone else see this?
ID: fix-executor-group
Function: cmd.run
Name: chown root /usr/lib/hadoop/bin/container-executor
Result: False
Comment: Command "chown root /usr/lib/hadoop/bin/container-executor" run
Started: 14:33:50.911201
Duration: 46.19 ms
Changes:
----------
pid:
10772
retcode:
1
stderr:
chown: cannot access `/usr/lib/hadoop/bin/container-executor': No such file or directory
stdout:
----------
ID: fix-executor-group
Function: cmd.run
Name: chgrp yarn /usr/lib/hadoop/bin/container-executor
Result: False
Comment: Command "chgrp yarn /usr/lib/hadoop/bin/container-executor" run
Started: 14:33:50.957615
Duration: 48.23 ms
Changes:
----------
pid:
10787
retcode:
1
stderr:
chgrp: cannot access `/usr/lib/hadoop/bin/container-executor': No such file or directory
stdout:
----------
ID: fix-executor-permissions
Function: cmd.run
Name: chmod 06050 /usr/lib/hadoop/bin/container-executor
Result: False
Comment: One or more requisite failed: {'hadoop.yarn.fix-executor-group': 'Command "chgrp yarn /usr/lib/hadoop/bin/container-executor" run'}
Started:
Duration:
Changes:
@sroegner, what is the reason that YARN data disks are configurable only through grains? It seems like something one may want to configure globally.
thx!
When I install a new hadoop cluster using cdh4.5.0 MR1, the jobtracker UI won't load out of the box. Instead, when I point my browser to master:50030 I see a directory listing.
If I copy /usr/lib/hadoop/share/hadoop/mapreduce/webapps to /usr/lib/hadoop and restart the jobtracker, the UI works. I tried making that a symlink, but jetty is not willing to follow the symlink for static files.
I think the solution is probably to adjust the classpath somehow but it's not immediately clear to me what the best way to do that is.
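Until the classpath question is settled, the manual fix described above could be captured as a state. A sketch; the state ID is hypothetical, and the paths are the ones from this report:

```yaml
jobtracker-webapps-workaround:
  cmd.run:
    - name: cp -r /usr/lib/hadoop/share/hadoop/mapreduce/webapps /usr/lib/hadoop/
    - unless: test -d /usr/lib/hadoop/webapps
```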
On a fresh install, I get the following error when highstating the minions:
ID: /etc/hadoop/conf/hdfs-site.xml
Function: file.managed
Result: False
Comment: Unable to manage file: Jinja variable 'hdfs_repl_override' is undefined
Started: 14:04:47.055940
Duration: 1270.958 ms
Changes:
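If the undefined `hdfs_repl_override` variable stems from a missing replication setting, explicitly pinning dfs.replication in pillar may work around the failure. The exact key path here is an assumption based on the formula's other pillar examples:

```yaml
hdfs:
  config:
    hdfs-site:
      dfs.replication:
        value: 3
```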
The hadoop state attempts to install the redhat-lsb package on RedHat-flavored OSes, which fails on Amazon Linux 2; a condition needs to be added here to cover this new case.
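The kind of guard that could cover this case might look like the following sketch (the surrounding state structure is assumed; on Amazon Linux, Salt reports `os: Amazon` with `os_family: RedHat`):

```sls
{% if grains['os_family'] == 'RedHat' and grains['os'] != 'Amazon' %}
redhat-lsb:
  pkg.installed
{% endif %}
```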
Not sure this is a formula problem, but I am seeing all downloads going to the default URLs coded in settings.sls.
Whatever the real size of the provisioned Hadoop cluster is, the formula will currently set the hdfs-site property dfs.replication to 1, unless the attribute is overridden in the pillar key hdfs:hdfs-site:dfs.replication.
This important property should by default reflect the number of datanodes available at provisioning time.
The implementation is, as is most of this formula, based on the availability of the salt mine and proper roles configuration (every host in roles:hadoop_slave is an HDFS datanode).
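At template-render time, the datanode count could be derived from the mine along these lines. A sketch only: the mine function name stands in for whatever function the formula actually publishes, and the grain-based target is assumed:

```sls
{# Count hosts matching the hadoop_slave role via the salt mine;
   'network.interfaces' is a placeholder for the formula's real mine function. #}
{% set slaves = salt['mine.get']('roles:hadoop_slave', 'network.interfaces', 'grain') %}
{% set replication = slaves | length if slaves else 1 %}
dfs.replication: {{ replication }}
```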
When I run salt-run state.orchestrate provision_hadoop, I get no errors except ones saying the service isn't running:
On the hadoop_slave:
ID: hdfs-services
Function: service.running
Name: hadoop-datanode
Result: False
Comment: The named service hadoop-datanode is not available
Started: 14:07:18.029442
Duration: 18.218 ms
Changes:
and on the hadoop_master:
----------
ID: hdfs-nn-services
Function: service.running
Name: hadoop-namenode
Result: False
Comment: The named service hadoop-namenode is not available
Started: 14:07:24.731055
Duration: 18.735 ms
Changes:
----------
ID: hdfs-nn-services
Function: service.running
Name: hadoop-secondarynamenode
Result: False
Comment: The named service hadoop-secondarynamenode is not available
Started: 14:07:24.750181
Duration: 18.097 ms
Changes:
I'm using the example provision_hadoop.sls file:

```yaml
prep:
  salt.state:
    - tgt: '*'
    - sls:
      - hostsfile
      - hostsfile.hostname
      - ntp.server
      - sun-java
      - sun-java.env

# the target for hadoop_services only means where the binaries and config will end up
# targeting for service startup and configuration is done on the service level in pillars
hadoop_services:
  salt.state:
    - tgt: 'G@roles:hadoop_master or G@roles:hadoop_slave'
    - tgt_type: compound
    - require:
      - salt: prep
    - sls:
      - hadoop
      - hadoop.hdfs
      - hadoop.yarn
```
I have this in my pillar:
hadoop:
  targeting_method: compound
hdfs:
  config:
    hdfs-site:
      dfs.datanode.synconclose:
        value: true
      dfs.durable.sync:
        value: true
      dfs.permission:
        value: false
  namenode_http_port: 50070
  namenode_port: 8020
  secondarynamenode_http_port: 50090
  datanode_target: "G@roles:hadoop_slave"
  namenode_target: "G@roles:hadoop_master"
This in the /etc/salt/grains file on the hadoop_master:
roles:
  - hadoop_master
hdfs_data_disks:
  - /hdfs
yarn_data_disks:
  - /yarn
and this in the grains file on the hadoop_slave:
roles:
  - hadoop_slave
hdfs_data_disks:
  - /hdfs
yarn_data_disks:
  - /yarn
Is there something else I need to specify in the pillar variables to get the services to start? Or is there some other issue?
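A "named service … is not available" comment from Salt usually means no init script or unit of that name exists on the minion, rather than a targeting problem. A hedged sketch of a state that surfaces this earlier (the init-script path and state IDs are assumptions, not the formula's actual layout):

```sls
# Hypothetical guard: fail with a clear message if the init script the
# formula is expected to install is missing, before trying to start it
hadoop-datanode-initscript:
  file.exists:
    - name: /etc/init.d/hadoop-datanode

hdfs-services:
  service.running:
    - name: hadoop-datanode
    - enable: True
    - require:
      - file: hadoop-datanode-initscript
```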
Hi, is there any specific reason why the formula uses "alt_home" and "real_home" and the corresponding "_config" variants?
I understand that you may want to keep several versions around and manage them with the alternatives system, but to me it seems counter-intuitive to specify one path in the pillar and then see a different directory created. I'll give you an example:
Pillar
hadoop:
  version: apache-2.7.3
  prefix: /usr/lib/hadoop/hadoop-2.7.3
  config:
    directory: /etc/hadoop
What I'm trying to do here is to have /usr/lib/hadoop/hadoop-2.7.3 as the prefix and, on upgrade, apply the same formula with hadoop-2.8.0 for example, keeping each versioned directory inside /usr/lib/hadoop. The alternatives system would then link the binaries into /usr/bin, as it does now.
On the configuration side I only want /etc/hadoop, without any version in the name (similar to what the formula does now with "/etc/hadoop-2.7.3", but without the version). I want it this way because at execution time the alternatives system only handles one version at a time, so it is the administrator's responsibility to maintain a config compatible with that binary.
Even if you don't agree with my way of seeing the configuration, my main point is that the formula has side effects: the information in the pillar doesn't exactly represent what the formula does.
I can make the changes to use the information provided by the pillar and drop the "alt"/"real" options, if that becomes an agreed decision at some point.
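A minimal sketch of the pillar-driven layout described above, without the alt/real indirection (state IDs, pillar defaults, and the absence of the alternatives step are all assumptions for illustration):

```sls
{# Sketch: create exactly the directories named in the pillar, nothing else #}
{% set p = salt['pillar.get']('hadoop', {}) %}

hadoop-prefix:
  file.directory:
    - name: {{ p.get('prefix', '/usr/lib/hadoop/hadoop-2.7.3') }}
    - makedirs: True

hadoop-conf-dir:
  file.directory:
    - name: {{ p.get('config', {}).get('directory', '/etc/hadoop') }}
```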
I am trying to install Apache Hadoop 2.7.1.
Not sure why I've got conf-2.2.0 and conf.dist directories under /etc/hadoop, with conf linked to /etc/alternatives/hadoop-conf-link:
nick@fig:/etc/hadoop$ ls -lat
total 28
drwxr-xr-x 147 root root 12288 Jul 16 16:59 ..
drwxr-xr-x 2 root root 4096 Jul 16 15:59 conf-2.7.1
drwxr-xr-x 2 root root 4096 Jul 16 13:46 conf-2.2.0
drwxr-xr-x 5 root root 4096 Jul 16 13:46 .
lrwxrwxrwx 1 root root 34 Jul 16 13:05 conf -> /etc/alternatives/hadoop-conf-link
drwxr-xr-x 2 root root 4096 Jun 29 16:15 conf.dist
I have tried removing the offending directories and linking conf -> ./conf-2.7.1; however, after a reboot it returns to the above configuration, even without a salt run.
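The revert after reboot suggests the alternatives database still prefers the old entry, so relinking by hand gets undone. A hedged sketch of pinning the link through Salt's alternatives states instead (the link name matches the symlink shown above; the priority value is an assumption):

```sls
# Sketch: register conf-2.7.1 with the alternatives system and make it
# the active choice, so the link survives reboots
hadoop-conf-link:
  alternatives.install:
    - name: hadoop-conf-link
    - link: /etc/hadoop/conf
    - path: /etc/hadoop/conf-2.7.1
    - priority: 50

hadoop-conf-choice:
  alternatives.set:
    - name: hadoop-conf-link
    - path: /etc/hadoop/conf-2.7.1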