delta-rho / install-emr
Installation script and instructions for setting up Tessera environment on Amazon Elastic MapReduce
Hello,
I got this error
gpg: directory `/home/hadoop/.gnupg' created
gpg: new configuration file `/home/hadoop/.gnupg/gpg.conf' created
gpg: WARNING: options in `/home/hadoop/.gnupg/gpg.conf' are not yet active during this run
gpg: keyring `/home/hadoop/.gnupg/secring.gpg' created
gpg: keyring `/home/hadoop/.gnupg/pubring.gpg' created
gpg: requesting key 381BA480 from hkp server pgpkeys.mit.edu
gpg: /home/hadoop/.gnupg/trustdb.gpg: trustdb created
gpg: key 381BA480: public key "Johannes Ranke (CRAN Debian archive) <[email protected]>" imported
gpg: no ultimately trusted keys found
gpg: Total number processed: 1
gpg: imported: 1
E: Version '3.1.0-1~squeezecran3.0' for 'r-base-core' was not found
E: Version '3.1.0-1~squeezecran3.0' for 'r-base-dev' was not found
chmod: cannot access `/usr/local/lib/R/site-library': No such file or directory
This is from the script at https://github.com/saptarshiguha/install-emr/blob/master/emr-2.4.2/install-r
(which is similar to the one in the main repo).
On a system with a proxy, the dig command doesn't work. Is this a better substitute?
curl -s http://whatismijnip.nl |cut -d " " -f 5
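One caveat with `cut -d " " -f 5` is that it depends on the exact wording of the service's reply. A slightly more defensive sketch pulls out the first token that looks like an IPv4 address from whatever the service returns. The function name `extract_ipv4` is hypothetical and not part of the install-emr scripts.

```shell
#!/usr/bin/env bash
# Hypothetical helper, not part of install-emr: print the first
# IPv4-looking token found on stdin, or return non-zero if there is none.
# Note: this only checks the dotted-quad shape, not that each octet is <= 255.
extract_ipv4() {
  local ip
  ip=$(grep -oE '([0-9]{1,3}\.){3}[0-9]{1,3}' | head -n 1)
  [ -n "$ip" ] && printf '%s\n' "$ip"
}

# Usage (needs network access; the URL is the one suggested above):
#   MY_IP=$(curl -s http://whatismijnip.nl | extract_ipv4) || echo "no IP found" >&2
```

Because the function fails when no address is found, the calling script can stop early instead of continuing with an empty value.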
It would be good to make the package installation scripts a bit more user friendly. At a minimum, we should check whether the package installation failed or not and if possible provide a link to the log.
If it's possible to actually extract the stderr log and we could print that, this would be even better.
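As a rough sketch of what this could look like, the wrapper below runs a command, keeps its stderr in a log file, and on failure prints the log path and the last lines of the log. The name `run_logged` and the example log path are assumptions for illustration, not something the current scripts provide.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper, not part of install-emr: run a command, save its
# stderr to a log file, and on failure report the log path and its tail.
run_logged() {
  local log=$1
  shift
  if "$@" 2>"$log"; then
    return 0
  else
    local status=$?
    echo "FAILED (exit $status): $*" >&2
    echo "stderr log: $log" >&2
    tail -n 20 "$log" >&2
    return "$status"
  fi
}

# Usage, with an illustrative package file and log path:
#   run_logged /tmp/install-mypkg.err R CMD INSTALL mypkg.tar.gz
```

The wrapper passes the original exit code through, so a calling script can still decide whether to abort or continue.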
If the package is in a subdirectory or you want to install from a different branch / commit than master, you can't do it with the current script. It would be good to make this possible.
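One possible approach, sketched under assumptions: GitHub serves a tarball of any branch, tag, or commit at `https://github.com/OWNER/REPO/archive/REF.tar.gz`, so the script could take the ref and an optional subdirectory as extra arguments. Both function names below are hypothetical, and `owner/repo` is a placeholder.

```shell
#!/usr/bin/env bash
# Hypothetical helper: build the GitHub archive URL for a repo at a given
# ref (branch, tag, or commit). Defaults to master.
github_tarball_url() {
  local repo=$1 ref=${2:-master}
  printf 'https://github.com/%s/archive/%s.tar.gz\n' "$repo" "$ref"
}

# Hypothetical helper: download the archive, unpack it into a temporary
# directory, and install the R package found in the given subdirectory.
install_github_pkg() {
  local repo=$1 ref=${2:-master} subdir=${3:-.}
  local tmp
  tmp=$(mktemp -d) || return 1
  curl -fsSL "$(github_tarball_url "$repo" "$ref")" \
    | tar -xz -C "$tmp" --strip-components=1 || return 1
  R CMD INSTALL "$tmp/$subdir"
}

# Usage (placeholder names; needs network access and R):
#   install_github_pkg owner/repo dev pkg/subdir
```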
It might be worth checking whether the cluster started correctly at
https://github.com/tesseradata/install-emr/blob/master/emr-3.2.1/tessera-emr.sh#L174
E.g. check that CLUSTER_ID is non-empty before tagging:
if [ -z "$CLUSTER_ID" ]; then
  # Cluster creation returned no ID, so there is nothing to tag
  echo "Cluster creation failed: CLUSTER_ID is empty" >&2
  exit 11
fi
echo "Tagging cluster…"
aws emr add-tags --resource-id "$CLUSTER_ID" --tags "$AWS_RES_TAG_KEY=$AWS_RES_TAG_VALUE"
Typing library(Rhipe) in the R console on the master node gives:
library(Rhipe)
Loading required package: codetools
Loading required package: rJava
Error : .onLoad failed in loadNamespace() for 'Rhipe', details:
call: fun(libname, pkgname)
error: could not find function "handleIOFormats"
In addition: Warning message:
S3 methods ‘[.mapfile’, ‘[<-.mapfile’, ‘[[.mapfile’, ‘[[<-.mapfile’, ‘print.mapfile’ were declared in NAMESPACE but not found
Error: package or namespace load failed for ‘Rhipe’
I assumed all the prerequisite packages were already bundled. Or do I have to install them myself?
Hi,
I've been getting this warning whenever I run the command. I presume it has something to do with Ruby.
I am running Ruby 2.1.3 on Windows. Any ideas?
Thank you
C:\Users\XXX\elastic-mapreduce-cli>ruby elastic-mapreduce ./elastic-mapreduce
--create --alive --name "RhipeCluster" --enable-debugging
Digest::Digest is deprecated; use Digest
Error: execution expired
C:\Users\XXX\elastic-mapreduce-cli>--num-instances 2 --slave-instance-type m1. large --master-instance-type m3.xlarge --ami-version "2.4.2"
'--num-instances' is not recognized as an internal or external command, operable program or batch file.
I am trying to set up the Tessera environment on Amazon EMR. I have set up the AWS account and cloned the GitHub repository to get the scripts. Then I ran tessera-emr.sh from my system.
The cluster is created, but when I tried to log in to RStudio Server from a web browser, the credentials did not work. I used USER=tessera-user, PASSWD=tessera
On investigating, I found that the customJAR step failed in the cluster.
Details of the customJAR step:
status: Failed
ID:s-1FIU62D4Y3X95
JAR location: s3://elasticmapreduce/libs/script-runner/script-runner.jar
Main class: None
Arguments: s3:///scripts/post-install-config.sh tessera-user tessera
Action on failure: Continue
Details from the controller log files:
2016-08-18T13:44:29.739Z INFO redirectOutput to /mnt/var/log/hadoop/steps/s-1FIU62D4Y3X95/stdout
2016-08-18T13:44:29.739Z INFO redirectError to /mnt/var/log/hadoop/steps/s-1FIU62D4Y3X95/stderr
2016-08-18T13:44:29.739Z INFO Working dir /mnt/var/lib/hadoop/steps/s-1FIU62D4Y3X95
2016-08-18T13:44:29.781Z INFO ProcessRunner started child process 19973 :
hadoop 19973 2325 0 13:44 ? 00:00:00 bash /home/hadoop/bin/hadoop jar /mnt/var/lib/hadoop/steps/s-1FIU62D4Y3X95/script-runner.jar s3://dmar-emr/scripts/post-install-config.sh tessera-user tessera
2016-08-18T13:44:29.783Z INFO Synchronously wait child process to complete : hadoop jar /mnt/var/lib/hadoop/steps/s-1FIU62D4...
2016-08-18T13:44:35.785Z INFO waitProcessCompletion ended with exit code 1 : hadoop jar /mnt/var/lib/hadoop/steps/s-1FIU62D4...
2016-08-18T13:44:35.785Z INFO total process run time: 6 seconds
2016-08-18T13:44:35.984Z INFO Step created jobs:
2016-08-18T13:44:35.984Z WARN Step failed with exitCode 1 and took 6 seconds
There is an error in installRstudio in install-tessera-master.sh.
The download URL should include rhel:
wget http://download2.rstudio.org/rstudio-server-rhel-${ver}-x86_64.rpm
Regards
Saptarshi
tessera-emr.sh has the EC2 key name hard-coded. This needs to be an option for the user to specify.
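A minimal sketch of how the key name could be made configurable, assuming a `-k` flag with an environment-variable fallback. The flag, the variable name `EC2_KEY_NAME`, and the function name `parse_key_name` are all assumptions for illustration, not what tessera-emr.sh currently uses.

```shell
#!/usr/bin/env bash
# Hypothetical option parsing: take the EC2 key name from -k NAME, falling
# back to the EC2_KEY_NAME environment variable, and fail if neither is set.
parse_key_name() {
  local OPTIND=1 opt key=${EC2_KEY_NAME:-}
  while getopts ":k:" opt; do
    case $opt in
      k) key=$OPTARG ;;
      :) echo "Option -$OPTARG requires a value" >&2; return 2 ;;
      \?) echo "Unknown option -$OPTARG" >&2; return 2 ;;
    esac
  done
  if [ -z "$key" ]; then
    echo "No EC2 key name given; use -k NAME or set EC2_KEY_NAME" >&2
    return 1
  fi
  printf '%s\n' "$key"
}

# Usage inside the script (illustrative):
#   KEY_NAME=$(parse_key_name "$@") || exit 1
```

Failing early when no key name is supplied avoids launching a cluster that cannot be reached over SSH.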
I am running tessera-emr.sh to create the cluster on AWS. While the script runs, I get the error below. What could be the reason?
An error occurred (ValidationException) when calling the RunJobFlow operation: AMI version 3.11.0 is not supported.
Thanks.