
cluster-validation's People

Contributors

dbtucker, jbenninghoff, jfota, nelsonestrada5, rostbach

cluster-validation's Issues

Chat/Discussion

Hi,
Any chance you are available for a chat/discussion about a comparison I'm doing between running runTeraGenSort.sh in a bare-metal environment and in a virtualised environment?

thanks
Magnus Andersson

Test upgrade sudo approach

export MAPR_TICKETFILE_LOCATION=/opt/mapr/conf/mapruserticket
sudo -u mapr maprcli config save -values {mapr.targetversion:"$(cat /opt/mapr/MapRBuildVersion)"}
sudo -u mapr maprcli cluster feature enable -all

OR:

su - mapr -c 'MAPR_TICKETFILE_LOCATION=/opt/mapr/conf/mapruserticket maprcli dashboard info -json'

DNS lookup and Reverse DNS lookup not found

I am trying to set up a 5-node cluster, but the installation fails. The installer failed on the last nodes with (Unable to execute command: timeout -s HUP 2m hadoop fs -put -f /opt/mapr/hive/hive-2.1/lib/hive-orc-2.1.1-mapr-1710.jar /installer/hive-2.1/. ).

When I try to run the script "network-test.sh" I get the following error:

DNS lookup (host not found: 2(SERVFAIL))

Reverse DNS lookup (Host 115.10.27.172.in-addr.arpa. not found: 3(NXDOMAIN))

The installer works for a single-node cluster. Please let me know how to solve this network-related issue.
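
The two failures reported by network-test.sh can be reproduced by hand on each node. The following is a minimal sketch, assuming glibc's `getent` is available and uses the same resolver configuration as the installer. The helper names `reverse_arpa` and `check_dns` are illustrative and are not part of cluster-validation.

```shell
# Build the PTR record name for an IPv4 address,
# e.g. 172.27.10.115 -> 115.10.27.172.in-addr.arpa.
reverse_arpa() {
    echo "$1" | awk -F. '{ printf "%s.%s.%s.%s.in-addr.arpa.\n", $4, $3, $2, $1 }'
}

# Check forward and reverse resolution for one host name (hypothetical helper).
check_dns() (
    host=$1
    # First IPv4 address the resolver returns for the name.
    ip=$(getent ahostsv4 "$host" | awk 'NR == 1 { print $1 }')
    if [ -z "$ip" ]; then
        echo "WARN: forward lookup failed for $host"
        return 1
    fi
    # getent hosts with an address performs a reverse lookup.
    if ! getent hosts "$ip" >/dev/null; then
        echo "WARN: reverse lookup failed for $ip ($(reverse_arpa "$ip"))"
        return 1
    fi
    echo "OK: $host <-> $ip"
)
```

Running `check_dns "$(hostname -f)"` on every node should show which hosts lack a forward or reverse entry. Those entries would then need to be added to DNS or to /etc/hosts on all nodes.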

Thank you

Postinst YCSB Ideas

Some ideas on the YCSB Test:

  1. Provide an option to create a new volume with X settings to hold the MapR DB table. This would allow one to script the performance tests and see what effect changing those settings (replication, etc.) has.

  2. Provide table creation options (region size, etc.).

  3. Provide an option for inmem true and false to see the differences.

  4. Create a Dockerfile that includes everything (MapR clients, HBase client, YCSB, etc.), with an option to build the image and push it to a local Docker registry, plus run options for running locally or running remotely by pulling the image. I may try to power through this; Dockerizing many of these tests would help ensure consistent environment dependencies.

Fix JDK find in cluster-audit.sh

Issue: The cluster audit reports that JDK is not installed, but it is. Here is the relevant error message from the audit logs:
ls: cannot access /usr/lib/jvm/java-1.8.0-openjdk-1.8.0.121-0.b13.el6_8.x86_64/jre/bin/jps: No such file or directory
JDK not installed

disk-test.sh "unexpected end of file"

The last update of disk-test.sh broke the script:

[root@ip-10-0-0-33 cluster-validation]# /root/cluster-validation/pre-install/disk-test.sh
/root/cluster-validation/pre-install/disk-test.sh: line 137: syntax error: unexpected end of file

Add check for LANG="en_US.UTF-8" in audit script

For installs outside the US, the audit should check that the locale is set to en_US.UTF-8. Otherwise all kinds of issues appear after install, such as not being able to log in to MCS, or non-English log messages that may cause problems for tech support.

It should be easy to add. I may send a PR later.
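
A minimal sketch of such a check, not taken from the audit script. The function name `check_lang` and the message wording are assumptions.

```shell
# Report whether a locale value matches the one the installer expects.
# Uses the current $LANG when no argument is given.
check_lang() (
    expected="en_US.UTF-8"
    actual=${1-${LANG-}}
    if [ "$actual" = "$expected" ]; then
        echo "LANG OK: $actual"
    else
        echo "WARN: LANG is '${actual:-unset}', expected $expected"
        return 1
    fi
)
```

In an audit run this would be executed on every node, for example through clush, so that a single node with a different locale stands out.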

Check Java more thoroughly in cluster-audit.sh

Check for any/all Java RPMs:
sudo yum list installed '*jdk*' '*java*'
Check for openjdk-devel.
Check for the Java bin directory using readlink:
javapath=$(dirname "$(readlink -f /usr/bin/java)")
if [ -f "$javapath/jps" ]
[[ :$PATH: == *":$javapath:"* ]] || PATH=$javapath:$PATH
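
The error quoted in the "Fix JDK find" issue shows jps being looked for under .../jre/bin, which is where /usr/bin/java resolves to on OpenJDK 8 packages, while the JDK tools sit in the sibling bin directory. The sketch below checks both locations. The function name `find_jps` is hypothetical, and GNU `readlink -f` is assumed.

```shell
# Resolve a java launcher through its symlinks and look for jps near it.
# Prints the jps path on success; jps ships with a JDK, not a JRE.
find_jps() (
    java_bin=${1:-/usr/bin/java}
    real_java=$(readlink -f "$java_bin") || return 1
    java_dir=$(dirname "$real_java")
    # Candidate 1: jps beside java.
    # Candidate 2: java lives in <jdk>/jre/bin, so the tools are in <jdk>/bin.
    for dir in "$java_dir" "${java_dir%/jre/bin}/bin"; do
        if [ -x "$dir/jps" ]; then
            echo "$dir/jps"
            return 0
        fi
    done
    echo "WARN: jps not found near $real_java (JRE only?)" >&2
    return 1
)
```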

Please adjust network test for 80 Gbit bonded interfaces

Please adjust the network test for 80 Gbit bonded interfaces.
The tests seem to be optimized for 10 Gbit interfaces. I was not able to verify 40 Gbit or 80 Gbit interfaces, because the measured throughput appears to be capped at 10 Gbit.

Failed dependencies: python is needed by clustershell-1.6-1.el6.noarch

I'm having trouble setting up the cluster validation tools (as root) on an Ubuntu x86_64 machine. I have the same problem on MapR M3 on AWS EMR.

$ rpm -i clustershell-1.6-1.el6.noarch.rpm
rpm: RPM should not be used directly install RPM packages, use Alien instead!
rpm: However assuming you know what you are doing...
warning: clustershell-1.6-1.el6.noarch.rpm: Header V3 RSA/SHA256 Signature, key ID 0608b895: NOKEY
error: Failed dependencies:
        /usr/bin/python is needed by clustershell-1.6-1.el6.noarch
        python(abi) = 2.6 is needed by clustershell-1.6-1.el6.noarch

When I use the --nodeps flag, the installation succeeds and the folder /etc/clustershell/ is created, but I still have a problem:

$ clush -a date
Traceback (most recent call last):
  File "/usr/bin/clush", line 7, in <module>
    from ClusterShell.CLI.Clush import main

Any idea what went wrong and how to fix it?

List clush groups in cluster-audit.sh

nodeset -l
@all
@zk
@cldb
@rm
@hist
@hbm
@hbr

root@psnode40 zsh#0 cat /etc/clustershell/groups
all: psnode[40-44]
zk: psnode[40-42]
cldb: psnode[40-42]
rm: psnode43
hist: psnode44
hbm: psnode44
hbr: @all

root@psnode40 zsh#0 cat /etc/clustershell/groups.d/local.cfg

# ClusterShell groups config local.cfg

all: example[4-6,32-159]

Can't test /dev/sdaa by disk-test.sh

  • STEPS TO REPRODUCE

OS Disk is /dev/sda.
Server has more than 27 disks.
So device name of last disk is /dev/sdaa.
Run disk-test.sh.

  • EXPECTED RESULTS

Command line used: /root/cluster-validation/pre-install/iozone -I -r 1M -s 4G -k 10 -+n -i 0 -i 1 -i 2 -f /dev/sdaa

  • ACTUAL RESULTS

Command line used: /root/cluster-validation/pre-install/iozone -I -r 1M -s 4G -k 10 -+n -i 0 -i 1 -i 2 -f a
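
The report does not say where the device name gets truncated. One plausible cause, stated here as an assumption and not confirmed against the script, is that the OS disk name /dev/sda is removed by prefix or substring matching, which turns /dev/sdaa into a. A sketch of excluding the OS disk by whole-name comparison instead; `exclude_disk` is a hypothetical helper.

```shell
# Print every disk from the list except the OS disk, comparing whole names
# so that /dev/sdaa is not mistaken for /dev/sda.
# Usage: exclude_disk OS_DISK DISK...
exclude_disk() (
    os_disk=$1
    shift
    for disk in "$@"; do
        [ "$disk" = "$os_disk" ] || echo "$disk"
    done
)
```

For example, `exclude_disk /dev/sda /dev/sda /dev/sdb /dev/sdaa` prints /dev/sdb and /dev/sdaa, one per line.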

tar up old log files in disk-test.sh

Use similar approach as in network-test.sh near top of script:

ssh $host 'files=$(ls *-{rpctest,iperf}.log 2>/dev/null); [ -n "$files" ] && { tar czf network-tests-$(date "+%Y-%m-%dT%H-%M%z").tgz $files; rm -f $files; }'
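
Adapted to disk-test.sh, the same idea might look like the sketch below. The `*-iozone.log` pattern, the archive name, and the function name `archive_disk_logs` are assumptions and would need to match what the script actually writes.

```shell
# Archive old disk-test logs in a directory, then remove the originals.
# The "*-iozone.log" pattern is a guess at the log naming; adjust as needed.
archive_disk_logs() (
    cd "${1:-.}" || return 1
    set -- *-iozone.log
    [ -e "$1" ] || return 0    # pattern did not match: nothing to archive
    tar czf "disk-tests-$(date "+%Y-%m-%dT%H-%M%z").tgz" "$@" && rm -f "$@"
)
```

The function body runs in a subshell, so the directory change and the reset positional parameters do not leak into the calling script.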

In cluster-audit.sh output, provide text to indicate problems.

The output produced provides information on the status of certain things, without any indication of whether it’s a problem. For example, on my cluster audit, I got this message, which is good (as it should be):

SElinux status: SELINUX=disabled
Disabled

And this message, which is bad (not as it should be):

Required RPMs:
package ntp is not installed

And this message, which is neutral:

mapr account for MapR Hadoop
mapr user NOT found!

(and the use of NOT in all caps could be taken to indicate this is a problem).

It would be nice if a setting that was problematic had a WARN: (or something) in front of it, so people could look for that keyword and make sure they’re not missing problems (or interpreting something as problematic when it’s not).
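
One way this could work is a filter that the audit output is piped through. The sketch below is an illustration only: the function name `flag_problems` is hypothetical, and the single pattern is taken from the output quoted above; a real list of problem patterns would have to be maintained by the script authors.

```shell
# Prefix an audit line with WARN: when it matches a known problem pattern.
# The pattern below is an example taken from the quoted audit output.
flag_problems() {
    while IFS= read -r line; do
        case $line in
            *"is not installed"*) echo "WARN: $line" ;;
            *) echo "$line" ;;
        esac
    done
}
```

With this in place, piping the audit output through `flag_problems | grep '^WARN:'` would list only the flagged lines.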

network-test.sh mods from cardlytics

tmpfile=$(mktemp); trap 'rc=$?; rm -f "$tmpfile"; echo "sigspec: $rc"; exit "$rc"' EXIT

iplist+=( $(ssh $host hostname -i) ) #TBD: check for more than 1 IP address

[ -n "$DBG" ] && read -p "$DBG: Press enter to continue or ctrl-c to abort"

/etc/shadow check

Add check in cluster-audit.sh and mapr-audit.sh for 'mapr' access to /etc/shadow
