SPARK CHEATPAGE

HDFS (Hadoop Distributed File System)

By default, HDFS replicates each data block on 3 DataNodes
DataNode directories
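
The replication factor can be checked from the command line. A minimal sketch, assuming the `hdfs` client is on the PATH; `show_replication` is a hypothetical helper name and the path argument is a placeholder:

```shell
# Print the default replication factor and the factor applied to one file.
# show_replication is a hypothetical helper, not an HDFS command.
show_replication() (
    path="$1"
    # dfs.replication is the default factor for newly written files
    echo "default: $(hdfs getconf -confKey dfs.replication)"
    # %r in -stat prints the replication factor of the given file
    echo "file: $(hdfs dfs -stat %r "$path")"
)
```

Called as `show_replication <path>`, it prints two lines, one for the cluster default and one for the file itself.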

Commands:

Check disk usage, delete bypassing the trash, and copy files to the local disk

hdfs dfs -du -h <path>
hdfs dfs -rm -skipTrash <path>
hdfs dfs -get  /user/hdfs/warehouse/Ericsson/LTE/CTR/hourly/rlret/customer=bouygues/*  /tmp/parquet/
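
Because `-skipTrash` deletes permanently, it can help to look at the size first and guard against an empty or root path. A minimal sketch under those assumptions; `purge_path` is a hypothetical helper name:

```shell
# Show the size of an HDFS path, then delete it permanently.
# purge_path is a hypothetical helper; -skipTrash bypasses the trash,
# so the delete cannot be undone.
purge_path() (
    target="$1"
    if [ -z "$target" ] || [ "$target" = "/" ]; then
        echo "refusing to delete '$target'" >&2
        exit 1
    fi
    hdfs dfs -du -s -h "$target" || exit 1   # summarised, human-readable size
    hdfs dfs -rm -r -skipTrash "$target"     # -r is needed for directories
)
```

Called as `purge_path <path>`, it stops before deleting if the size check fails, for example because the path does not exist.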

Ambari

hdfs dfsadmin -report
systemctl status ambari-agent 
systemctl status postgresql
sudo ambari-server restart
Logs: /var/log/ambari-server/ambari-server.out
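
The status checks above can be wrapped in a small loop. This is only a sketch: `check_units` is a hypothetical helper name, and the units passed to it are whichever services run on the host.

```shell
# Report whether each given systemd unit is running.
# check_units is a hypothetical helper name.
check_units() {
    for unit in "$@"; do
        # is-active --quiet exits 0 only when the unit is active
        if systemctl is-active --quiet "$unit"; then
            echo "$unit: running"
        else
            echo "$unit: NOT running"
        fi
    done
}
```

For example, `check_units ambari-agent postgresql` prints one line per unit.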

Bringing everything back up

hdfs dfsadmin -report
systemctl status ambari-agent
If everything is down in Ambari:
1) Start HDF
2) If there is no active NameNode, run the following command from the NameNode1 server, ONLY IF IT IS REQUIRED (run it on sparkmaster):
sudo -u hdfs hdfs haadmin  -ns nameservice1 -transitionToActive --forceactive nn1 --forcemanual  
sudo -u hdfs hdfs haadmin -ns nameservice1 -getServiceState nn2
standby
sudo -u hdfs hdfs haadmin -ns nameservice1 -getServiceState nn1
active
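
Before forcing a transition, it is worth confirming that no NameNode is already active. A minimal sketch, assuming the NameNode IDs `nn1` and `nn2` and the nameservice `nameservice1` used in the commands above; `active_namenode` is a hypothetical helper name:

```shell
# Print the ID of the active NameNode, or fail if none is active.
# active_namenode is a hypothetical helper; nn1/nn2 are the IDs used above.
active_namenode() (
    ns="${1:-nameservice1}"
    for nn in nn1 nn2; do
        # getServiceState prints "active" or "standby" (lowercase)
        state="$(sudo -u hdfs hdfs haadmin -ns "$ns" -getServiceState "$nn" 2>/dev/null)"
        if [ "$state" = "active" ]; then
            echo "$nn"
            exit 0
        fi
    done
    echo "no active NameNode in $ns" >&2
    exit 1
)
```

The forced `-transitionToActive` is only needed when `active_namenode` exits with a non-zero status.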

HUE WAREHOUSE:

Once data is processed, it goes to the warehouse directory as Parquet files. These files cannot be viewed by clicking on them, but the data can be seen through Impala tables.
Unprocessed data must be placed in hdfs/raw; it is then moved to the warehouse.
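
To confirm from the command line that processed files reached the warehouse, the Parquet files under a directory can be listed. A minimal sketch; `list_parquet` is a hypothetical helper name, the directory argument is a placeholder, and the filter assumes the file names end in `.parquet` and contain no spaces:

```shell
# List the Parquet files under an HDFS directory.
# list_parquet is a hypothetical helper name.
list_parquet() {
    # -ls -R lists recursively; the path is the last column of each row
    hdfs dfs -ls -R "$1" | awk '$NF ~ /\.parquet$/ {print $NF}'
}
```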

YARN (Yet Another Resource Negotiator)

YARN lets multiple data access applications share the cluster's resources to process the data.
The ResourceManager keeps the metadata about which jobs are running on which NodeManager and how much memory and CPU each one consumes, so it has a holistic view of the total CPU and RAM consumption of the whole cluster.
UI: Here we see the running jobs
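
The same information is available without the UI through the `yarn` command-line client. A minimal sketch, assuming `yarn` is on the PATH; `count_running_apps` is a hypothetical helper name:

```shell
# Count the applications currently in the RUNNING state.
# count_running_apps is a hypothetical helper name.
count_running_apps() {
    # Application rows start with "application_"; header lines do not.
    yarn application -list -appStates RUNNING 2>/dev/null | grep -c '^application_'
}
```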

YARN UI

AMBARI

Ambari is used to manage all the modules
AMBARI_UI
