
cloudera-playbook's People

Contributors

anis016, dbeech, jimvin, jnowakowski, jrkinley, lhoss, maciejkowalczyk, rafaelarana, roczei


cloudera-playbook's Issues

New Project, Ansible deployment

Hi,

I have created a new Ansible script for multi-node deployment and configuration of CDH. I want to publish it in this or a related repository. What are the steps to do this?

Thanks,

Vicky

AnsibleUndefinedVariable: 'dict object' has no attribute

I am running the playbook from the CM host and getting the error below. I have tried a lot of things but could not figure out what is going wrong: whether it is a problem with the variables in group_vars, or whether I am missing something.

TASK [scm : file] **********************************************************************************************************************************************************************************************************************************************************************************************************************************************************
changed: [kkulkani-cdhkerberos-1]

TASK [scm : Import KDC admin credentials] **********************************************************************************************************************************************************************************************************************************************************************************************************************************
ok: [kkulkani-cdhkerberos-1]

TASK [scm : Wait for agent heartbeats] *************************************************************************************************************************************************************************************************************************************************************************************************************************************
Pausing for 30 seconds
(ctrl+C then 'C' = continue early, ctrl+C then 'A' = abort)
Press 'C' to continue the play or 'A' to abort
ok: [kkulkani-cdhkerberos-1]

TASK [scm : Prepare CMS template] ******************************************************************************************************************************************************************************************************************************************************************************************************************************************
fatal: [kkulkani-cdhkerberos-1]: FAILED! => {"changed": false, "msg": "AnsibleUndefinedVariable: 'dict object' has no attribute u'kkulkani-cdhkerberos-1'"}
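
One way to narrow this error down, sketched under the assumption that the failing template indexes a dictionary of Cloudera Manager host IDs by inventory hostname (the log above does not confirm this): compare the inventory names with the keys that are actually present. The names `missing_hosts` and `scm_host_ids`, the FQDN, and the host ID are hypothetical.

```python
def missing_hosts(inventory_hosts, scm_host_ids):
    """Return inventory hostnames that have no entry in the host-ID mapping."""
    known = {name.lower() for name in scm_host_ids}
    return [host for host in inventory_hosts if host.lower() not in known]


# Inventory name taken from the log above.
inventory = ["kkulkani-cdhkerberos-1"]
# Hypothetical mapping: the host was registered under a fully qualified name.
scm_host_ids = {"kkulkani-cdhkerberos-1.example.com": "hypothetical-host-id"}

print(missing_hosts(inventory, scm_host_ids))
```

If the list is non-empty, the inventory name and the registered name differ, which would produce exactly this kind of "has no attribute" lookup failure.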

Cluster Template Cannot Be Imported

I have been slowly working through getting this repo to work for me, and I think I have finally hit a problem I don't know how to solve myself. I am setting up an SCM server, a DB server, an edge server, 2 name nodes, and 4 data nodes (this differs from the README by lacking a 3rd name node and KRB5). This is the error I am getting:

TASK [cdh : Wait for import cluster template command to complete] ******************************************************************************************************************************************
FAILED - RETRYING: Wait for import cluster template command to complete (10 retries left).
FAILED - RETRYING: Wait for import cluster template command to complete (9 retries left).
FAILED - RETRYING: Wait for import cluster template command to complete (8 retries left).
FAILED - RETRYING: Wait for import cluster template command to complete (7 retries left).
FAILED - RETRYING: Wait for import cluster template command to complete (6 retries left).
FAILED - RETRYING: Wait for import cluster template command to complete (5 retries left).
FAILED - RETRYING: Wait for import cluster template command to complete (4 retries left).
fatal: [scm-server.eastus.companyname.com]: FAILED! => {"attempts": 8, "changed": false, "connection": "close", "content": "{\n \"id\" : 34,\n \"name\" : \"ClusterTemplateImport\",\n \"startTime\" : \"2020-01-17T00:20:36.163Z\",\n \"endTime\" : \"2020-01-17T00:27:04.770Z\",\n \"active\" : false,\n \"success\" : false,\n \"resultMessage\" : \"Failed to import cluster template.\",\n \"children\" : {\n \"items\" : [ {\n \"id\" : 46,\n \"name\" : \"First Run\",\n \"startTime\" : \"2020-01-17T00:27:03.941Z\",\n \"endTime\" : \"2020-01-17T00:27:04.765Z\",\n \"active\" : false,\n \"success\" : false,\n \"resultMessage\" : \"Failed to perform First Run of services.\"\n }, {\n \"id\" : 36,\n \"name\" : \"DeployParcels\",\n \"startTime\" : \"2020-01-17T00:20:36.446Z\",\n \"endTime\" : \"2020-01-17T00:27:00.061Z\",\n \"active\" : false,\n \"success\" : true,\n \"resultMessage\" : \"The Following parcels successfully activated : CDH-6.3.2-1.cdh6.3.2.p0.1605554.\",\n \"clusterRef\" : {\n \"clusterName\" : \"cluster_1\",\n \"displayName\" : \"cluster_1\"\n }\n } ]\n },\n \"canRetry\" : true\n}", "content_type": "application/json;charset=utf-8", "cookies": 
{"CLOUDERA_MANAGER_SESSIONID": "node0lu3fbcycum4o1wjvg7w5o2rjl16974.node0"}, "cookies_string": "CLOUDERA_MANAGER_SESSIONID=node0lu3fbcycum4o1wjvg7w5o2rjl16974.node0", "date": "Fri, 17 Jan 2020 00:27:43 GMT", "elapsed": 0, "expires": "Thu, 01 Jan 1970 00:00:00 GMT", "failed_when_result": true, "json": {"active": false, "canRetry": true, "children": {"items": [{"active": false, "endTime": "2020-01-17T00:27:04.765Z", "id": 46, "name": "First Run", "resultMessage": "Failed to perform First Run of services.", "startTime": "2020-01-17T00:27:03.941Z", "success": false}, {"active": false, "clusterRef": {"clusterName": "cluster_1", "displayName": "cluster_1"}, "endTime": "2020-01-17T00:27:00.061Z", "id": 36, "name": "DeployParcels", "resultMessage": "The Following parcels successfully activated : CDH-6.3.2-1.cdh6.3.2.p0.1605554.", "startTime": "2020-01-17T00:20:36.446Z", "success": true}]}, "endTime": "2020-01-17T00:27:04.770Z", "id": 34, "name": "ClusterTemplateImport", "resultMessage": "Failed to import cluster template.", "startTime": "2020-01-17T00:20:36.163Z", "success": false}, "msg": "OK (unknown bytes)", "redirected": false, "set_cookie": "CLOUDERA_MANAGER_SESSIONID=node0lu3fbcycum4o1wjvg7w5o2rjl16974.node0;Path=/;HttpOnly", "status": 200, "url": "http://scm-server.eastus.companyname.com:7180/api/v33/commands/34", "x_content_type_options": "nosniff", "x_frame_options": "DENY", "x_xss_protection": "1; mode=block"}

Yes, that is what it gives, \n's and all. Reading through it, I cannot figure out what is going wrong. The Cloudera server is running; however, I do see some errors when looking through the log file (these occur hundreds of lines apart and are condensed here for reading):

2020-01-17 00:27:02,724 ERROR scm-web-475:com.cloudera.cmf.service.AbstractRoleHandler: Unable to generate configuration for GATEWAY base group
2020-01-17 00:27:02,725 WARN scm-web-475:com.cloudera.server.cmf.descriptor.components.DescriptorFactory: Could not generate client configs for service: YARN (MR2 Included)
java.lang.RuntimeException: com.cloudera.cmf.service.config.ConfigGenException: Could not find JOBHISTORY dependent role

2020-01-17 00:27:02,848 WARN scm-web-475:com.cloudera.server.cmf.descriptor.components.DescriptorFactory: Could not generate client configs for service: Hive
java.lang.RuntimeException: java.lang.RuntimeException: com.cloudera.cmf.service.config.ConfigGenException: Could not find JOBHISTORY dependent role

2020-01-17 00:31:43,798 INFO scm-web-494:com.cloudera.api.ApiExceptionMapper: Exception caught in API invocation. Msg:Role does not have a process.
java.util.NoSuchElementException: Role does not have a process.

2020-01-17 00:31:43,885 WARN scm-web-494:com.cloudera.server.cmf.components.OperationsManagerImpl: Exception while building client config:
java.lang.RuntimeException: com.cloudera.cmf.service.config.ConfigGenException: Could not find JOBHISTORY dependent role

2020-01-17 00:31:43,888 WARN scm-web-494:com.cloudera.api.ApiExceptionMapper: Unexpected exception. Msg:java.lang.IllegalStateException: Failed to create client configuration for service yarn
java.lang.RuntimeException: java.lang.IllegalStateException: Failed to create client configuration for service yarn

2020-01-17 00:31:43,973 WARN scm-web-476:com.cloudera.server.cmf.components.OperationsManagerImpl: Exception while building client config:
java.lang.RuntimeException: java.lang.RuntimeException: com.cloudera.cmf.service.config.ConfigGenException: Could not find JOBHISTORY dependent role

The list of these goes on for about another 30 errors, all mentioning JOBHISTORY. What am I doing wrong for this to occur? Do I need to be running the third name node?
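
The command result quoted above is easier to read once the failing steps are pulled out of the nested JSON. A minimal sketch, assuming only the fields visible in that output (`success`, `name`, `resultMessage`, `children.items`); the function name `failed_steps` is hypothetical.

```python
def failed_steps(command):
    """Recursively collect (name, resultMessage) for unsuccessful commands."""
    failures = []
    if not command.get("success", True):
        failures.append((command.get("name"), command.get("resultMessage")))
    for child in command.get("children", {}).get("items", []):
        failures.extend(failed_steps(child))
    return failures


# Trimmed copy of the "json" value from the Ansible output above.
result = {
    "name": "ClusterTemplateImport",
    "success": False,
    "resultMessage": "Failed to import cluster template.",
    "children": {"items": [
        {"name": "First Run", "success": False,
         "resultMessage": "Failed to perform First Run of services."},
        {"name": "DeployParcels", "success": True,
         "resultMessage": "The Following parcels successfully activated : CDH-6.3.2-1.cdh6.3.2.p0.1605554."},
    ]},
}

for name, message in failed_steps(result):
    print(f"{name}: {message}")
```

For this result the parcels deployed fine and only the First Run step failed, which matches the JOBHISTORY errors in the server log.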

Action plugin 'scm_hosts.py' uses deprecated CM API client

... and therefore does not support Python 3. Python 2 reaches end of life soon.

What's worse is that because the plugin runs on the Ansible control node, it also means we don't support running Ansible itself with Python 3. We need to fix this code or replicate what it does without using the Python CM API client.
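
A minimal sketch of the second option (replicating the lookup without the Python CM API client), assuming the Cloudera Manager REST API exposes a `hosts` collection whose entries carry `hostname` and `hostId` fields. The function names and the use of `urllib` are illustrative and are not the plugin's actual code.

```python
import base64
import json
import urllib.request


def host_ids_by_name(hosts_response):
    """Map hostname -> hostId from a parsed hosts listing (assumed field names)."""
    return {h["hostname"]: h["hostId"] for h in hosts_response.get("items", [])}


def fetch_hosts(base_url, api_version, user, password):
    """GET <base_url>/api/<api_version>/hosts with HTTP basic auth (assumed endpoint)."""
    url = f"{base_url}/api/{api_version}/hosts"
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    request = urllib.request.Request(url, headers={"Authorization": f"Basic {token}"})
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)


# Example with a hand-written response instead of a live server.
sample = {"items": [{"hostname": "cdh1.example.com", "hostId": "hypothetical-id-1"}]}
print(host_ids_by_name(sample))
```

Only the standard library is used, so the same code runs under Python 3 on the Ansible control node.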

Playbook fails on Ansible 2.9

Error message:

TASK [scm : Extract the host identifiers and names into facts]
****************************************************
task path: /root/cloudera-playbook/roles/scm/tasks/main.yml:81
fatal: [...]: FAILED! => {
   "msg": "An unhandled exception occurred while running the lookup plugin 'template'. Error was a <type 'exceptions.AttributeError'>, original message: 'VariableManager' object has no attribute '_loader'"
}

Version details:

ansible --version
ansible 2.9.0
 config file = /root/.ansible.cfg
 configured module search path = [u'/root/.ansible/plugins/modules', u'/usr/share/ansible/plugins/modules']
 ansible python module location = /usr/lib/python2.7/site-packages/ansible
 executable location = /usr/bin/ansible
 python version = 2.7.5 (default, Apr 11 2018, 07:36:10) [GCC 4.8.5 20150623 (Red Hat 4.8.5-28)]

Could be related to ansible/ansible#57437 ?

[Feature request][Enhancement] Maintaining deployments

I'm using this playbook as a base for an offline and secured environment. From the way it works, it looks like it was designed only to deploy clusters; it does nothing to maintain them afterwards. Is there any chance it will be developed in that direction in the future? I'm thinking about features like:
- maintaining service configuration
- creating and maintaining host templates
- detecting changes in the code/templates and applying them

I'm currently working in this direction, but I'm not very efficient because I'm fairly new to Ansible and, to some extent, to the Cloudera API. Thanks.
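
The "detecting changes" feature could start from a plain comparison of desired and current settings. A minimal sketch, assuming both sides are available as flat dictionaries; the function name `config_drift` and the configuration keys are hypothetical and not taken from the playbook.

```python
def config_drift(desired, current):
    """Return {key: (current_value, desired_value)} for settings that differ."""
    return {
        key: (current.get(key), value)
        for key, value in desired.items()
        if current.get(key) != value
    }


# Hypothetical settings for illustration only.
desired = {"dfs_replication": "3", "zookeeper_datadir_autocreate": "true"}
current = {"dfs_replication": "2", "zookeeper_datadir_autocreate": "true"}

print(config_drift(desired, current))
```

Only keys present in `desired` are compared, so settings managed outside the playbook are left alone.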

[Feature Request] Tests (using Molecule)

At the moment there are no tests at all, which makes it more difficult to refactor the code, to test changes, and thus to contribute.

My proposal is to change that using the Molecule framework.
As a start, I have an initial version running in a Docker container (with systemd), single node first (multiple nodes are possible too).
You can check my branch: https://github.com/scigility/cloudera-playbook/tree/molecule_tests
Please note: this is not ready yet, but I wanted to inform the community about the initial work.

Playbook fails on cdh : Set_fact task

Error is -

fatal: [cld1.cisco.local]: FAILED! => {
"msg": "The task includes an option with an undefined variable. The error was: 'dict object' has no attribute u'cld5.cisco.local'\n\nThe error appears to be in '/root/cloudera-playbook/roles/cdh/tasks/main.yml': line 40, column 3, but may\nbe elsewhere in the file depending on the exact syntax problem.\n\nThe offending line appears to be:\n\n# https://www.cloudera.com/documentation/enterprise/latest/topics/install_cluster_template.html\n- set_fact:\n ^ here\n"
}

where cld1.cisco.local is the SCM host.
Hosts file used for this deployment:

[scm_server]
cld1.cisco.local

[db_server]
cld1.cisco.local

[krb5_server]
cld1.cisco.local

[utility_servers:children]
scm_server
db_server
krb5_server

[gateway_servers]
cld1.cisco.local host_template=HostTemplate-Gateway

#[edge_servers]
# host_template=HostTemplate-Edge role_ref_names=HDFS-HTTPFS-1

[master_servers]
cld5.cisco.local host_template=HostTemplate-Master1
cld6.cisco.local host_template=HostTemplate-Master2
cld7.cisco.local host_template=HostTemplate-Master3

[worker_servers]
cld2.cisco.local
cld3.cisco.local
cld4.cisco.local

[worker_servers:vars]
host_template=HostTemplate-Workers

[cdh_servers:children]
utility_servers
gateway_servers
master_servers
worker_servers

#[all:vars]
#ansible_user=ec2-user

Let us know if this issue has been seen earlier and whether any fixes or workarounds are available.

Thanks,

Rajesh

The task includes an option with an undefined variable.

TASK [scm : set_fact] *************************************************************************************************************************************************************************
task path: /home/deploy/cloudera-playbook/roles/scm/tasks/cms.yml:9
fatal: [cdh1.dev]: FAILED! => {
"msg": "The task includes an option with an undefined variable. The error was: 'dict object' has no attribute u'cdh1.dev'\n\nThe error appears to be in '/home/deploy/cloudera-playbook/roles/scm/tasks/cms.yml': line 9, column 3, but may\nbe elsewhere in the file depending on the exact syntax problem.\n\nThe offending line appears to be:\n\n\n- set_fact:\n ^ here\n"
}

Role configuration group reference in host template is not valid

I cannot seem to get around this:

TASK [cdh : Import cluster template] **************************************************************************************************************************************************************
fatal: [xxxxxxxxxxxxxxxxx]: FAILED! => {"changed": false, "connection": "close", "content": "{\n "message" : "Role configuration group reference in host template SPARK2_ON_YARN-1-SPARK2_YARN_HISTORY_SERVER-BASE is not valid.\nRole configuration group reference in host template SPARK2_ON_YARN-1-GATEWAY-BASE is not valid."\n}", "content_type": "application/json", "date": "Tue, 28 Apr 2020 18:30:08 GMT", "elapsed": 0, "expires": "Thu, 01-Jan-1970 00:00:00 GMT", "json": {"message": "Role configuration group reference in host template SPARK2_ON_YARN-1-SPARK2_YARN_HISTORY_SERVER-BASE is not valid.\nRole configuration group reference in host template SPARK2_ON_YARN-1-GATEWAY-BASE is not valid."}, "msg": "Status code was 400 and not [200]: HTTP Error 400: Bad Request", "redirected": false, "server": "Jetty(6.1.26.cloudera.4)", "set_cookie": "CLOUDERA_MANAGER_SESSIONID=qz8i0dbv9an36cbnv88y8xdk;Path=/;HttpOnly", "status": 400, "url": "http://xxxxxxxxxxxxxx:7180/api/v19/cm/importClusterTemplate?addRepositories=true"}

"AnsibleUndefinedVariable: 'dict object' has no attribute u'server1.example.com'

Hi,

I'm also facing the same issue: I get an error on the scm template task. I have modified the inventory file to use FQDNs and tried everything I could think of, but couldn't figure out what the issue is. I would appreciate it if anyone could provide a quick solution.

TASK [scm : Prepare CMS template] ****************************************************************************
task path: /root/cloudera-playbook/roles/scm/tasks/cms.yml:9

fatal: [server1.example.com]: FAILED! => {"changed": false, "msg": "AnsibleUndefinedVariable: 'dict object' has no attribute u'server1.example.com'"}

==========================================================
Hosts:

[scm_server]
server1.example.com license_file=/path/to/cloudera_license.txt

[db_server]
server1.example.com

[utility_servers:children]
scm_server
db_server

[gateway_servers]
servergw.example.com host_template=HostTemplate-Gateway role_ref_names=HDFS-HTTPFS-1

[master_servers]
server1.example.com host_template=HostTemplate-Master1

[worker_servers]
server2.example.com

[worker_servers:vars]
host_template=HostTemplate-Workers

[cdh_servers:children]
utility_servers
gateway_servers
master_servers
worker_servers

[all:vars]
ansible_user=root

AnsibleUndefinedVariable: 'dict object' has not attribute u'xxx.xxx.xxx.xxx'"

@roczei - I'm executing the playbook from a local VM (a single-node VM with CM installed on it) and getting the error: AnsibleUndefinedVariable: 'dict object' has no attribute u'xxx.xxx.xxx.xxx'

TASK [scm : Wait for agent heartbeats] *************************************************************************************************************************************************************************************************************************************************************************************************************************************
Pausing for 30 seconds
(ctrl+C then 'C' = continue early, ctrl+C then 'A' = abort)
Press 'C' to continue the play or 'A' to abort
ok: [xxx.xxx.xxx.xxx]

TASK [scm : Prepare CMS template] ****************************************************************************************************************************************
fatal: [xxx.xxx.xxx.xxx]: FAILED! => {"changed": false, "msg": "AnsibleUndefinedVariable: 'dict object' has no attribute u'xxx.xxx.xxx.xxx'"}

Please note I'm using the IP address 127.0.0.1. Please guide me on how to use an FQDN instead of the IP address in this case.
The /etc/hosts file looks as follows:
127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4
::1 localhost localhost6.localdomain6 localhost6

The inventory hosts file looks as follows:

[local]
127.0.0.1

[scm_server]
127.0.0.1

[db_server]
127.0.0.1

[utility_servers:children]
scm_server
db_server

#[krb5_server]
#127.0.0.1
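
A small check can flag inventory entries that are IP addresses rather than names. This is a sketch under the assumption that Cloudera Manager registers hosts by hostname, so an inventory keyed by IP would not match; the function names are hypothetical, and `socket.getfqdn()` only helps here because the playbook runs on the same single-node VM.

```python
import ipaddress
import socket


def is_ip_address(name):
    """Return True if the inventory entry is a literal IPv4 or IPv6 address."""
    try:
        ipaddress.ip_address(name)
        return True
    except ValueError:
        return False


def suggested_inventory_name(name):
    """Suggest the local machine's FQDN when the entry is an IP address."""
    return socket.getfqdn() if is_ip_address(name) else name


for entry in ["127.0.0.1", "server1.example.com"]:
    print(entry, "->", suggested_inventory_name(entry))
```

With the /etc/hosts file shown above, the local name would likely resolve to a localhost alias, so a real hostname entry may need to be added there first.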

Use new method to enable Kerberos via cluster templates (6.3+)

As of Cloudera Manager 6.3 there is a simplified way to enable Kerberos via cluster templates, using configuration similar to this:

"instantiator": {
    "clusterName": "test",

     "enableKerberos": {
         "datanodeTransceiverPort" : <optional/default 1004>,
         "datanodeWebPort" : <optional/default 1006>
      },

The playbook could expose this functionality.
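
A minimal sketch of how the playbook might build that section, using only the field names and default ports from the snippet above. The function name `kerberos_instantiator` is hypothetical, and a real instantiator needs further fields (hosts, variables) that are omitted here.

```python
import json


def kerberos_instantiator(cluster_name, transceiver_port=1004, web_port=1006):
    """Build the instantiator fragment with the enableKerberos block."""
    return {
        "clusterName": cluster_name,
        "enableKerberos": {
            "datanodeTransceiverPort": transceiver_port,
            "datanodeWebPort": web_port,
        },
    }


print(json.dumps({"instantiator": kerberos_instantiator("test")}, indent=2))
```

Keeping the ports as parameters with the documented defaults lets an inventory variable override them without touching the template.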

Template import fails on CDH 7.1.4

After commenting out the Sentry services, the existing Ansible play fails on template import for CDH 7.1.4. The Ansible log has the following error:

"json": {
  "active": false,
  "canRetry": true,
  "children": {
    "items": [
      {
        "active": false,
        "endTime": "2020-12-09T19:32:15.333Z",
        "id": 52,
        "name": "First Run",
        "resultMessage": "Failed to perform First Run of services.",
        "startTime": "2020-12-09T19:32:09.202Z",
        "success": false
      },
      {
        "active": false,
        "clusterRef": {
          "clusterName": "cdpsol-auto-cluster",
          "displayName": "cdpsol-auto-cluster"
        },
        "endTime": "2020-12-09T19:31:59.777Z",
        "id": 41,
        "name": "DeployParcels",
        "resultMessage": "The Following parcels successfully activated : CDH-7.1.4-1.cdh7.1.4.p0.6300266.",
        "startTime": "2020-12-09T19:24:52.679Z",
        "success": true
      }
    ]
  },
  "endTime": "2020-12-09T19:32:15.335Z",
  "id": 39,
  "name": "ClusterTemplateImport",
  "resultMessage": "Failed to import cluster template.",
  "startTime": "2020-12-09T19:24:52.535Z",
  "success": false
},
"msg": "OK (unknown bytes)",
"pragma": "no-cache",
"redirected": false,
"set_cookie": "SESSION=f5f6ea92-46d8-48a0-bd97-f9b6c5aeabf9;Path=/;HttpOnly",
"status": 200,
"url": "http://cdp-scmnode.cisco.local:7180/api/v42/commands/39",
"x_content_type_options": "nosniff",
"x_frame_options": "DENY",

Command status on the UI shows the error as - Command failed to run because service HUE has invalid configuration. First error : Expected dependency of type HIVE_ON_TEZ/HIVE_LLAP but is HIVE.

Screenshot attached: cdp-play-error-hive-on-tez

If we accept the dialog to fix this issue via the cluster URL, the cluster deployment completes successfully.

Any pointers on how to fix this in the Ansible playbook code would be very helpful.

"AnsibleUndefinedVariable: 'ansible.parsing.yaml.objects.AnsibleUnicode object' has no attribute u'leon-test-1.test.com'"

I am running the playbook from the CM host and getting the error below. I have tried a lot of things but could not figure out what is going wrong.

PLAY [Install CDH] ************************************************************************************************************************************

TASK [Gathering Facts] ********************************************************************************************************************************
ok: [leon-test-1.test.com]

TASK [cdh : include_vars] *****************************************************************************************************************************
ok: [leon-test-1.test.com]

TASK [cdh : include_vars] *****************************************************************************************************************************
ok: [leon-test-1.test.com]

TASK [cdh : include_vars] *****************************************************************************************************************************
ok: [leon-test-1.test.com]

TASK [cdh : include_vars] *****************************************************************************************************************************
ok: [leon-test-1.test.com]

TASK [cdh : Check whether cluster exists] *************************************************************************************************************
ok: [leon-test-1.test.com]

TASK [cdh : set_fact] *********************************************************************************************************************************
ok: [leon-test-1.test.com]

TASK [cdh : debug] ************************************************************************************************************************************
ok: [leon-test-1.test.com] => {
"msg": "Cluster 'cluster_1' exists - False"
}

TASK [cdh : Prepare cluster template] *****************************************************************************************************************
fatal: [leon-test-1.test.com -> localhost]: FAILED! => {"changed": false, "msg": "AnsibleUndefinedVariable: 'ansible.parsing.yaml.objects.AnsibleUnicode object' has no attribute u'leon-test-1.test.com'"}
to retry, use: --limit @/root/shell/cluster-script/test/cloudera-playbook-master/cdh.retry

PLAY RECAP ********************************************************************************************************************************************
leon-test-1.test.com : ok=8 changed=0 unreachable=0 failed=1

=============================

hosts
[scm_server]
leon-test-1.test.com license_file=/path/to/cloudera_license.txt

[db_server]
leon-test-1.test.com

[krb5_server]
leon-test-1.test.com default_realm=

[utility_servers:children]
scm_server
db_server
krb5_server

[gateway_servers]
leon-test-1.test.com host_template=HostTemplate-Gateway role_ref_names=HDFS-HTTPFS-1

[master_servers]
leon-test-1.test.com host_template=HostTemplate-Master1
leon-test-2.test.com host_template=HostTemplate-Master2
leon-test-3.test.com host_template=HostTemplate-Master3

[worker_servers]
leon-test-3.test.com
leon-test-4.test.com
leon-test-5.test.com

[worker_servers:vars]
host_template=HostTemplate-Workers

[cdh_servers:children]
utility_servers
gateway_servers
master_servers
worker_servers

AnsibleUndefinedVariable: ''dict object'' has no attribute inside template module

Hi everyone,

I'm currently working on a playbook in which I need to upload a script to my remote server.

- name: set currentminio_port
  set_fact:
    curr_port: "{{ item.value.minio_port }}"

- name:
  debug:
    msg: "{{ install_dir }}/minio/bin/minio-{{ curr_port }}.sh"

- name: minio | Install | copy policy_file
  become: yes
  #no_log: true
  template:
    src: "{{ policy_file }}"
    dest: "{{ install_dir }}/minio/policies/minio_policy_{{ vm_role }}_{{ bucket_list }}-{{ curr_port }}.json"
    owner: "{{ user_app }}"
    group: "{{ user_app }}"
  when: policy_file is defined

- name:
  debug:
    msg: "item: {{ item }} \nitem.key: {{ item.key }}\nitem.value: {{ item.value }}\nitem.value.minio_port: {{ item.value.minio_port }}"

- name: minio | Install | upload minio script
  become: yes
  #no_log: true
  template:
    src: "minio.sh"
    dest: "{{ install_dir }}/minio/bin/minio-{{ item.value.minio_port }}.sh"
    owner: "{{ user_app }}"
    group: "{{ user_app }}"
    mode: "0750"

It is the last template task that fails each time I launch the playbook.
I'm looping over a dictionary that looks like this:

ports:
  dumps:
    minio_port: 8100
    minio_console_port: 8101
  executables:
    minio_port: 8102
    minio_console_port: 8103

Finally, here is the log, so you can check that my variables are reachable.

TASK [minio : debug] **************************************************************************************************************************************************************************************
ok: [Z36-DV-I1-PSQ01] =>
msg: /opt/minio/bin/minio-8100.sh

TASK [minio : minio | Install | copy policy_file] *********************************************************************************************************************************************************
ok: [Z36-DV-I1-PSQ01]

TASK [minio : debug] **************************************************************************************************************************************************************************************
ok: [Z36-DV-I1-PSQ01] =>
msg: |-
item: {'key': u'dumps', 'value': {u'minio_port': 8100, u'minio_console_port': 8101}}
item.key: dumps
item.value: {u'minio_port': 8100, u'minio_console_port': 8101}
item.value.minio_port: 8100

TASK [minio : minio | Install | upload minio script] ******************************************************************************************************************************************************
fatal: [Z36-DV-I1-PSQ01]: FAILED! => changed=false
msg: 'AnsibleUndefinedVariable: ''dict object'' has no attribute ''minio_port'''

PLAY RECAP ************************************************************************************************************************************************************************************************
Z36-DV-I1-PSQ01 : ok=28 changed=1 unreachable=0 failed=1 skipped=28 rescued=0 ignored=0
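
The debug output above shows each loop item as a key/value pair. As a rough illustration of that shape, assuming the loop iterates the `ports` dictionary with a dict-based loop, the same structure can be produced in plain Python; the function name `dict_items` is hypothetical.

```python
def dict_items(mapping):
    """Turn a dict into a list of {'key': ..., 'value': ...} entries."""
    return [{"key": key, "value": value} for key, value in mapping.items()]


ports = {
    "dumps": {"minio_port": 8100, "minio_console_port": 8101},
    "executables": {"minio_port": 8102, "minio_console_port": 8103},
}

for item in dict_items(ports):
    print(item["key"], item["value"]["minio_port"])
```

Since `item.value.minio_port` resolves in the task itself, a next step could be to check whether the `minio.sh` template refers to `minio_port` on some other dictionary.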

I hope someone has already experienced this issue and can help me.

Thanks in advance

Running script without license keys

Hi, I am setting up a cluster without a license. I managed to run the script, though I found a few issues in it, and I will share the details. To run without a license I removed some lines of code from the script, because it included modules such as advanced reporting.
Can anybody suggest what changes need to be made to run the script without a license, i.e. in free mode?

5 PRs lost during PR #40 force-push merged 2019-08-30. Let's bring'em back!

As already mentioned in #40 (comment),
I detected a number of former PRs/fixes that are no longer in master (and have been missing for too long).

I identified 5 LOST PRs, by looking at:
https://github.com/cloudera/cloudera-playbook/pulls?q=is%3Apr+is%3Aclosed+sort%3Aupdated-desc
(and analyzing the code)
#25, merged 20190827, 11h48
#28, merged 20190827, 11h58
#37, merged 20190828, 10h49
#38, merged 20190829, 19h14
#39, merged 20190830, 11h56


Today I'll focus on bringing back #28, which contains critical fixes required for an install happening today at one customer, where the install runs from AWX/Tower.
I can also work on bringing back the useful changes from PRs 37-39 (all by @dbeech), if @dbeech has no time.

cdp 7.0.3 template deployed but services are stopped

Please note I am trying to install a single node cluster with all basic services.
I have deployed cdp 7.0.3 and cdh 7.0.3 with some tweaks in this code and the playbook, but it failed at the last step, importing the cluster template (the error is "Could not find Oozie Server for service oozie"). My issues are:

  1. I can see that services are deployed but all services are stopped and in error state
    cdp_cluster_deployed

  2. 4 major services, i.e. Hue, Impala, Oozie and Spark, are not deployed with their servers, e.g. the Oozie server.
    oozie_server_error

I tried to manually install the services and add the Oozie server, but it's not letting me do so.
error_add_oozie_server
However, I can see that the config is already present for oozie database.
oozie_server_database_config

  3. Moreover, I see the datanode is not assigned in the cluster.

I am attaching the cluster template i.e. http://x.x.x.x:7180/api/v40/cm/deployment
cdp_cluster_template.txt

I added the HDFS role later (it is not included in the template). Now it is asking me to deploy a minimum of 3 journal nodes, which I will figure out.

Please see if we need to modify the cluster and service templates in the cdh role to fix these.

Java role has (oracle)JDK v7 hardcoded

I just hope nobody used it, and that everyone excluded it in the site.yml playbook (or by using tags other than the 'java' tag),
since JDK 8 has been the recommended version for some years now.

As the idea is not to use external Ansible roles here (there is of course a large choice of them for Java/JDK installs), there are 2 good solutions:

Need help in running the playbook

Hi, I see following error while running the playbook.
"msg": "No hosts defined in SCM"

It fails in this task:

- name: Get SCM hostIds for inventory hosts
  action: scm_hosts
  register: scm_hosts_result

I tried looking at scm_hosts.py under action_plugin, but had no luck. Can someone help with this, please?

Playbook fails when trying to install OpenJDK 11

The JCE installation code tries to check/edit the file $JAVA_HOME/jre/lib/security/java.security but this does not exist after installation of java-11-openjdk-devel package. We need to ignore errors or skip that task.
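
A third option is to look for the file in more than one place before deciding to skip. The sketch below assumes the usual layouts: `jre/lib/security` for a JDK 8 install, `lib/security` for a standalone JRE 8, and `conf/security` for JDK 9 and later. The function name `find_java_security` is hypothetical, and the playbook's own task would still be written in Ansible.

```python
import os


def find_java_security(java_home):
    """Return the first existing java.security path under java_home, or None."""
    candidates = [
        os.path.join(java_home, "jre", "lib", "security", "java.security"),  # JDK 8
        os.path.join(java_home, "lib", "security", "java.security"),         # JRE 8
        os.path.join(java_home, "conf", "security", "java.security"),        # JDK 9+
    ]
    for path in candidates:
        if os.path.isfile(path):
            return path
    return None


# JAVA_HOME may be unset; the fallback path here is only an example.
print(find_java_security(os.environ.get("JAVA_HOME", "/usr/lib/jvm/java")))
```

A `None` result is the signal to skip the JCE task rather than fail the play.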

Update this repo with fixes made by other contributors

Hi, this repo has been very stale and I see that other contributors have forked this project and made some valuable changes. Would you be able to aggregate those changes and update this repository in order to have a more central and trustworthy repository to look for.

Thanks in advance! ;)

Cloudera Web Server doesn't start

I have been working with this repo for the past couple days and have had a recurring issue that I cannot seem to solve. I am using Microsoft Azure as a server host, and I am trying to run this playbook without KRB5. I have followed the instructions given, and am receiving this error:

fatal: [business-scm-server.eastus.cloudapp.azure.com]: FAILED! => {"changed": false, "elapsed": 301, "msg": "Timeout when waiting for business-scm-server.eastus.cloudapp.azure.com:7180"}

I have checked the SCM server and confirmed that Cloudera is running; however, I am unable to connect to it through a browser, and I cannot determine the cause. Did I have to change something that is not in the documentation because of Azure?
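
A quick way to separate "service not listening" from "port blocked on the way" is to test the TCP port from both the server itself and the control machine. This is a generic sketch, not something the playbook provides; the function name `port_is_open` is hypothetical, and whether an Azure network rule is involved is only a guess.

```python
import socket


def port_is_open(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


# Example call, left commented out because it needs network access:
# print(port_is_open("business-scm-server.eastus.cloudapp.azure.com", 7180))
```

If the check succeeds on the server against localhost but fails from outside, the service is up and the traffic is being filtered somewhere in between.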

No service type 'SENTRY' available for cluster with version 'CDH 7.0.3'

Hi, I am trying to install cdp 7.0.3 using this playbook and have mostly managed to do so. The second-to-last step of the play, "Import cluster template", didn't run successfully and failed with the error below. Could someone help me here, please?

 java.lang.IllegalArgumentException: No service type 'SENTRY' available for cluster with version 'CDH 7.0.3'.
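
One workaround is to strip the unsupported service from the template before importing it. A minimal sketch, assuming the cluster template JSON has a top-level `services` list whose entries carry a `serviceType` field; the function name `drop_service_type` and the sample `refName` values are hypothetical.

```python
def drop_service_type(template, service_type):
    """Return a copy of the template without services of the given type."""
    kept = [
        service for service in template.get("services", [])
        if service.get("serviceType") != service_type
    ]
    return {**template, "services": kept}


# Hypothetical, heavily trimmed template for illustration.
template = {
    "cdhVersion": "7.0.3",
    "services": [
        {"refName": "HDFS-1", "serviceType": "HDFS"},
        {"refName": "SENTRY-1", "serviceType": "SENTRY"},
    ],
}

print([s["refName"] for s in drop_service_type(template, "SENTRY")["services"]])
```

Other services may still reference the removed one in their configuration, and host templates may list its role groups, so those references would need cleaning up as well.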

keep getting "invalid role mapping HDFS-HTTPFS-1" for "import cluster template task" when running ansible playbook role

fatal: [52.61.239.57]: FAILED! => {"changed": false, "connection": "close", "content": "{\n \"message\" : \"Invalid role mapping for HDFS-HTTPFS-1. No role group configuration of type HTTPFS is present in the host template HostTemplate-Gateway.\"\n}", "content_type": "application/json;charset=utf-8", "date": "Mon, 30 Sep 2019 21:06:45 GMT", "elapsed": 0, "expires": "Thu, 01 Jan 1970 00:00:00 GMT", "json": {"message": "Invalid role mapping for HDFS-HTTPFS-1. No role group configuration of type HTTPFS is present in the host template HostTemplate-Gateway."}, "msg": "Status code was 400 and not [200]: HTTP Error 400: Bad Request", "redirected": false, "set_cookie": "CLOUDERA_MANAGER_SESSIONID=node07hsc8tsx0kuffp3j6ziceh437.node0;Path=/;HttpOnly", "status": 400, "url": "http://52.61.239.57:7180/api/v33/cm/importClusterTemplate?addRepositories=true", "x_content_type_options": "nosniff", "x_frame_options": "DENY", "x_xss_protection": "1; mode=block"}

error on tmp/scm.json file

I get an error on a temp file. When I switch the value of pipelining in ansible.cfg, the error goes away and then comes back. I am not sure why.


fatal: [cloudmanager.ee-hadoop.com.au]: FAILED! => {"msg": "Failed to get information on remote file (/home/hadoop/tmp/scm.json): Sorry, try again.\n[sudo via ansible, key=bkfkxloiefebvjjruekluqqrpzaeymwo] password: \nsudo: 1 incorrect password attempt\n"}

Intermittent timeouts when downloading MySQL/J connector zip

From testing today we observed that the download from https://dev.mysql.com/get/Downloads/Connector-J/mysql-connector-java-5.1.48.zip sometimes worked and sometimes timed out. When it does work, it actually redirects to a CDN (https://cdn.mysql.com/Downloads/Connector-J/mysql-connector-java-5.1.48.zip).

I guess there could be some rate-limiting in place?
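Whatever the cause, an intermittent download is usually made more robust by retrying with a pause between attempts. The following is a minimal sketch of that idea, not what the playbook does; with_retries and download are hypothetical names, and the attempt count and delays are arbitrary.

```python
import shutil
import time
import urllib.request


def with_retries(action, attempts=3, delay=1.0, sleep=time.sleep):
    """Call action() until it succeeds or attempts are exhausted.

    Network failures from urllib (URLError, timeouts) are OSError subclasses,
    so only OSError is retried; the wait grows linearly with each attempt.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return action()
        except OSError:
            if attempt == attempts:
                raise
            sleep(delay * attempt)


def download(url, destination, timeout=30.0):
    """Fetch url (following redirects) and write the body to destination."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        with open(destination, "wb") as out:
            shutil.copyfileobj(response, out)
    return destination


# Example usage (URL from the report above):
# url = "https://dev.mysql.com/get/Downloads/Connector-J/mysql-connector-java-5.1.48.zip"
# with_retries(lambda: download(url, "/tmp/mysql-connector-java-5.1.48.zip"))
```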

mariadb fails to start

On a CentOS 7.4 system, /roles/mariadb/tasks/main.yml fails when trying to start mariadb after successfully creating the configuration file, the log file, and the PID directory.

A manual yum install of mariadb-server works fine.

Logs attached:
logs.txt
job_35.txt

krb5_kdc_type not registering properly?

I set
krb5_kdc_type: none
in all

but I got:

TASK [scm : Update Cloudera Manager settings] ************************************************************
fatal: [ip-10-0-1-170.ap-southeast-2.compute.internal]: FAILED! => {"msg": "The task includes an option with an undefined variable. The error was: 'dict object' has no attribute 'krb5_server'\n\nThe error appears to be in '/home/cdh_terraform_aws/cloudera-playbook/roles/scm/tasks/scm.yml': line 4, column 3, but may\nbe elsewhere in the file depending on the exact syntax problem.\n\nThe offending line appears to be:\n\n# https://cloudera.github.io/cm_api/apidocs/v13/path__cm_config.html\n- name: Update Cloudera Manager settings\n  ^ here\n"}

Everything else was straight from the master branch.
Any ideas?

Ubuntu Support

Is there any intent or desire to support Ubuntu as a platform?

scm roles set_fact issue related to cms template to submit

I have an issue related to set_fact on cms tasks:

TASK [scm : set_fact] **************************************************************************************************************************************************************************************************************
fatal: [admin-cdh-dev.intra.local]: FAILED! => {"msg": "The task includes an option with an undefined variable. The error was: 'dict object' has no attribute u'admin-cdh-dev.intra.local'\n\nThe error appears to be in '/home/ansible/playbooks/oat/roles/scm/tasks/cms.yml': line 9, column 3, but may\nbe elsewhere in the file depending on the exact syntax problem.\n\nThe offending line appears to be:\n\n\n- set_fact:\n  ^ here\n"}
PLAY RECAP *************************************************************************************************************************************************************************************************************************
admin-cdh-dev.intra.local  : ok=67   changed=5    unreachable=0    failed=1    skipped=24   rescued=0    ignored=0
edge01-cdh-dev.intra.local : ok=42   changed=3    unreachable=0    failed=0    skipped=14   rescued=0    ignored=0
master01-cdh-dev.intra.local : ok=42   changed=3    unreachable=0    failed=0    skipped=14   rescued=0    ignored=0
master02-cdh-dev.intra.local : ok=42   changed=3 

All nodes are registered with the Cloudera server:

 Login: Friday 22 January 2021 at 22:23:47 CET from master01-cdh-dev.intra.local on pts/7
[ansible@master01-cdh-dev ~]$ curl -s -H "Accept: application/json" -H "Content-Type: application/json" --user "admin:admin" http://admin-cdh-dev.intra.local:7180/api/v33/hosts
{
  "items" : [ {
    "maintenanceOwners" : [ ],
    "hostId" : "1b7229a2-5ec3-4b65-990f-8011bbd079cf",
    "ipAddress" : "10.181.24.153",
    "hostname" : "admin-cdh-dev.intra.local",
    "rackId" : "/default",
    "hostUrl" : "http://admin-cdh-dev.intra.local:7180/cmf/hostRedirect/1b7229a2-5ec3-4b65-990f-8011bbd079cf",
    "maintenanceMode" : false,
    "commissionState" : "COMMISSIONED",
    "numCores" : 2,
    "numPhysicalCores" : 2,
    "totalPhysMemBytes" : 8201801728
  }, {
    "maintenanceOwners" : [ ],
    "hostId" : "b76b87ef-dfe2-488c-8305-3bc7d17658d1",
    "ipAddress" : "10.181.24.151",
    "hostname" : "edge01-cdh-dev.intra.local",
    "rackId" : "/default",
    "hostUrl" : "http://admin-cdh-dev.intra.local:7180/cmf/hostRedirect/b76b87ef-dfe2-488c-8305-3bc7d17658d1",
    "maintenanceMode" : false,
    "commissionState" : "COMMISSIONED",
    "numCores" : 8,
    "numPhysicalCores" : 8,
    "totalPhysMemBytes" : 33566892032
  }, {
    "maintenanceOwners" : [ ],
    "hostId" : "1a573cd3-e32a-4339-b588-956927be2d50",
    "ipAddress" : "10.181.24.145",
    "hostname" : "master01-cdh-dev.intra.local",
    "rackId" : "/default",
    "hostUrl" : "http://admin-cdh-dev.intra.local:7180/cmf/hostRedirect/1a573cd3-e32a-4339-b588-956927be2d50",
    "maintenanceMode" : false,
    "commissionState" : "COMMISSIONED",
    "numCores" : 4,
    "numPhysicalCores" : 4,
    "totalPhysMemBytes" : 16657203200
  }, {
    "maintenanceOwners" : [ ],
    "hostId" : "3fb6cb06-a560-4401-a854-0a82ab349cf8",
    "ipAddress" : "10.181.24.146",
    "hostname" : "master02-cdh-dev.intra.local",
    "rackId" : "/default",
    "hostUrl" : "http://admin-cdh-dev.intra.local:7180/cmf/hostRedirect/3fb6cb06-a560-4401-a854-0a82ab349cf8",
    "maintenanceMode" : false,
    "commissionState" : "COMMISSIONED",
    "numCores" : 4,
    "numPhysicalCores" : 4,
    "totalPhysMemBytes" : 16657203200
  }, {
    "maintenanceOwners" : [ ],
    "hostId" : "2805e250-f367-4a83-a003-201d850f0780",
    "ipAddress" : "10.181.24.147",
    "hostname" : "master03-cdh-dev.intra.local",
    "rackId" : "/default",
    "hostUrl" : "http://admin-cdh-dev.intra.local:7180/cmf/hostRedirect/2805e250-f367-4a83-a003-201d850f0780",
    "maintenanceMode" : false,
    "commissionState" : "COMMISSIONED",
    "numCores" : 4,
    "numPhysicalCores" : 4,
    "totalPhysMemBytes" : 16657203200
  }, {
    "maintenanceOwners" : [ ],
    "hostId" : "a397854a-9024-4cc2-9c8d-4574301622cc",
    "ipAddress" : "10.181.24.152",
    "hostname" : "utility-cdh-dev.intra.local",
    "rackId" : "/default",
    "hostUrl" : "http://admin-cdh-dev.intra.local:7180/cmf/hostRedirect/a397854a-9024-4cc2-9c8d-4574301622cc",
    "maintenanceMode" : false,
    "commissionState" : "COMMISSIONED",
    "numCores" : 8,
    "numPhysicalCores" : 8,
    "totalPhysMemBytes" : 33566892032
  }, {
    "maintenanceOwners" : [ ],
    "hostId" : "8009ebf7-5482-4ed3-8ab8-bd941d0033aa",
    "ipAddress" : "10.181.24.148",
    "hostname" : "worker01-cdh-dev.intra.local",
    "rackId" : "/default",
    "hostUrl" : "http://admin-cdh-dev.intra.local:7180/cmf/hostRedirect/8009ebf7-5482-4ed3-8ab8-bd941d0033aa",
    "maintenanceMode" : false,
    "commissionState" : "COMMISSIONED",
    "numCores" : 8,
    "numPhysicalCores" : 8,
    "totalPhysMemBytes" : 33566892032
  }, {
    "maintenanceOwners" : [ ],
    "hostId" : "01f4a1c0-e4ca-47c1-aafc-0825d69cfe94",
    "ipAddress" : "10.181.24.149",
    "hostname" : "worker02-cdh-dev.intra.local",
    "rackId" : "/default",
    "hostUrl" : "http://admin-cdh-dev.intra.local:7180/cmf/hostRedirect/01f4a1c0-e4ca-47c1-aafc-0825d69cfe94",
    "maintenanceMode" : false,
    "commissionState" : "COMMISSIONED",
    "numCores" : 8,
    "numPhysicalCores" : 8,
    "totalPhysMemBytes" : 33566892032
  }, {
    "maintenanceOwners" : [ ],
    "hostId" : "eb80d0f6-6a52-4eaa-a52c-31b71c34d84f",
    "ipAddress" : "10.181.24.150",
    "hostname" : "worker03-cdh-dev.intra.local",
    "rackId" : "/default",
    "hostUrl" : "http://admin-cdh-dev.intra.local:7180/cmf/hostRedirect/eb80d0f6-6a52-4eaa-a52c-31b71c34d84f",
    "maintenanceMode" : false,
    "commissionState" : "COMMISSIONED",
    "numCores" : 8,
    "numPhysicalCores" : 8,
    "totalPhysMemBytes" : 33566892032
  } ]
}

I am trying to find a logical explanation for this, but I am struggling to understand why it is blocking on CMS template creation.

By the way, my current setup sets krb5_kdc_type to none and removes krb5_server from the hosts. However, that led me to issue #70, and to solve it I ended up adding krb5_server back to the hosts while keeping krb5_kdc_type: 'none'.
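The set_fact error above says that some dictionary has no entry for the key admin-cdh-dev.intra.local. One way to investigate is to build the hostname to hostId mapping directly from the /api/v33/hosts response shown above and compare it with the inventory hostnames, which exposes any mismatch in spelling, case, or domain suffix. This is only a diagnostic sketch; it does not reproduce what cms.yml or scm_hosts.py actually do, and both function names are hypothetical.

```python
def host_ids_by_name(api_response):
    """Map each hostname in a /hosts API response to its hostId."""
    return {
        item["hostname"]: item["hostId"]
        for item in api_response.get("items", [])
    }


def missing_hosts(inventory_hosts, api_response):
    """Return the inventory hostnames that the API response does not list."""
    known = host_ids_by_name(api_response)
    return [host for host in inventory_hosts if host not in known]


# Example usage with two entries copied from the output above:
# response = {"items": [
#     {"hostname": "admin-cdh-dev.intra.local",
#      "hostId": "1b7229a2-5ec3-4b65-990f-8011bbd079cf"},
#     {"hostname": "edge01-cdh-dev.intra.local",
#      "hostId": "b76b87ef-dfe2-488c-8305-3bc7d17658d1"},
# ]}
# print(missing_hosts(["admin-cdh-dev.intra.local"], response))
```

If the mapping built this way contains every inventory host, the lookup that fails in the playbook is probably reading from a different or empty dictionary.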
