szhorizon / spatialhadoop2

This project forked from aseldawy/spatialhadoop2

The second generation of SpatialHadoop that ships as an extension

License: Other

spatialhadoop2's Introduction

IMPORTANT NOTICE

SpatialHadoop is no longer maintained. We recommend that you use our most recent system, Beast, which is built for Spark and provides the following improvements over SpatialHadoop:

  • More file formats: Beast supports many standard formats including Esri Shapefile, CSV, GeoJSON, and GPX.
  • Multidimensional data: Beast inherently supports multidimensional data so it can index and process 3D or higher dimensions.
  • Better indexes: Beast adopts new indexes including R*-Grove, the most advanced index for big spatial data.
  • Higher performance: Thanks to the advanced features in Spark.
  • Easier to use: You can run Beast without any installation on top of your existing Spark installation.
  • Scalable visualization: Visualize terabytes of data on an interactive map. Beast powers UCR-Star, the home of terabytes of public geospatial data.
  • Raptor = Raster + Vector: Beast hosts a novel component for processing raster and vector data concurrently. It is orders of magnitude faster than competitive systems.
  • Active: Beast is still active in development and new features are regularly added to it.

SpatialHadoop

SpatialHadoop is an extension to Hadoop that provides efficient processing of spatial data using MapReduce. It provides spatial data types for use in MapReduce jobs, including point, rectangle, and polygon. It also adds low-level spatial indexes in HDFS such as Grid file, R-tree, and R+-tree. New InputFormats and RecordReaders allow MapReduce jobs to read and process spatial indexes efficiently. SpatialHadoop also ships with a set of spatial operations implemented as efficient MapReduce jobs that access spatial indexes using the new components. Developers can implement myriad spatial operations that run efficiently using spatial indexes.

How it works

SpatialHadoop is used in the same way as Hadoop. Data files are first loaded into HDFS and indexed using the index command which builds a spatial index of your choice over the input file. Once the file is indexed, you can execute any of the spatial operations provided in SpatialHadoop such as range query, k-nearest neighbor and spatial join. New operations are added occasionally to SpatialHadoop such as polygon union and convex hull.

Install

SpatialHadoop is packaged as a single jar file which contains all the classes required to run. All operations, including building the index, can be accessed through this jar file, and the Hadoop framework automatically distributes it to all slave nodes. In addition, the spatial-site.xml configuration file needs to be placed in the conf directory of your Hadoop installation. This file lets you configure SpatialHadoop for your cluster.

Examples

Here are a few examples of how to use SpatialHadoop.

Generate a non-indexed spatial file with rectangles in a rectangular area of 1M x 1M

shadoop generate test.rects size:1.gb shape:rect mbr:0,0,1000000,1000000 

Build a grid index over the generated file

shadoop index test.rects sindex:grid test.grid shape:rect
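
To give a feel for what a grid index does conceptually, the following is a minimal Java sketch that maps a rectangle to the cells of a uniform grid laid over the 1M x 1M area used above. It is not SpatialHadoop's implementation: the class name GridSketch, the 4 x 4 resolution, and the cell numbering are assumptions made purely for illustration.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative uniform-grid partitioner; not SpatialHadoop's actual code. */
final class GridSketch {
    // Space covered by the grid, matching mbr:0,0,1000000,1000000 in the example.
    static final double MIN_X = 0, MIN_Y = 0, MAX_X = 1_000_000, MAX_Y = 1_000_000;
    // Hypothetical grid resolution (assumption): 4 columns by 4 rows.
    static final int COLS = 4, ROWS = 4;

    /** Returns the IDs (row * COLS + col) of all grid cells the rectangle touches. */
    static List<Integer> overlappingCells(double x1, double y1, double x2, double y2) {
        double cellW = (MAX_X - MIN_X) / COLS;
        double cellH = (MAX_Y - MIN_Y) / ROWS;
        int c1 = clamp((int) Math.floor((x1 - MIN_X) / cellW), COLS);
        int c2 = clamp((int) Math.floor((x2 - MIN_X) / cellW), COLS);
        int r1 = clamp((int) Math.floor((y1 - MIN_Y) / cellH), ROWS);
        int r2 = clamp((int) Math.floor((y2 - MIN_Y) / cellH), ROWS);
        List<Integer> cells = new ArrayList<>();
        for (int r = r1; r <= r2; r++) {
            for (int c = c1; c <= c2; c++) {
                cells.add(r * COLS + c);
            }
        }
        return cells;
    }

    /** Keeps a cell index inside [0, count - 1] so border shapes stay in the grid. */
    static int clamp(int index, int count) {
        return Math.max(0, Math.min(count - 1, index));
    }

    public static void main(String[] args) {
        // A small rectangle near the origin falls into a single cell.
        System.out.println(overlappingCells(10, 10, 2000, 3000));
        // A rectangle straddling a cell corner is assigned to four cells.
        System.out.println(overlappingCells(240_000, 240_000, 260_000, 260_000));
    }
}
```

In this sketch a shape that crosses cell boundaries is listed under every cell it touches. How a real index handles such shapes is a design choice that this example does not try to reproduce.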

Run a range query that selects rectangles overlapping the query area defined by the box with the two corners (10, 10) and (2000, 3000). Results are stored in the output file rangequery.out.

shadoop rangequery test.grid rect:10,10,2000,3000 rangequery.out shape:rect
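
The selection that a range query performs can be sketched in a few lines of Java. This is a plain in-memory illustration, not the MapReduce job that SpatialHadoop runs: the names RangeQuerySketch, Rect, and rangeQuery are hypothetical, and treating rectangles that merely touch the query box as overlapping is an assumption of this sketch.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative in-memory range query; not SpatialHadoop's actual code. */
final class RangeQuerySketch {

    /** Axis-aligned rectangle with corners (x1, y1) and (x2, y2). */
    static final class Rect {
        final double x1, y1, x2, y2;

        Rect(double x1, double y1, double x2, double y2) {
            this.x1 = x1;
            this.y1 = y1;
            this.x2 = x2;
            this.y2 = y2;
        }

        /** True if the two rectangles share at least one point (closed intervals). */
        boolean overlaps(Rect o) {
            return x1 <= o.x2 && o.x1 <= x2 && y1 <= o.y2 && o.y1 <= y2;
        }

        @Override
        public String toString() {
            return "Rect(" + x1 + "," + y1 + "," + x2 + "," + y2 + ")";
        }
    }

    /** Returns every record that overlaps the query rectangle. */
    static List<Rect> rangeQuery(List<Rect> records, Rect query) {
        List<Rect> result = new ArrayList<>();
        for (Rect r : records) {
            if (r.overlaps(query)) {
                result.add(r);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Same query box as rect:10,10,2000,3000 in the command above.
        Rect query = new Rect(10, 10, 2000, 3000);
        List<Rect> records = new ArrayList<>();
        records.add(new Rect(0, 0, 50, 50));             // overlaps the lower-left corner
        records.add(new Rect(5000, 5000, 6000, 6000));   // entirely outside the query
        records.add(new Rect(1500, 2500, 2500, 3500));   // overlaps the upper-right corner
        System.out.println(rangeQuery(records, query));
    }
}
```

On an indexed file, the general benefit is that only the partitions whose boundaries overlap the query box need to be scanned with a test like this one, rather than the whole file.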

Compile

Advanced users and contributors might like to compile SpatialHadoop on their own machines. SpatialHadoop can be compiled via Maven. First, you need to get your own copy of the source code, which resides on GitHub. To clone the repository with git, run the following command:

git clone https://github.com/aseldawy/spatialhadoop2.git

If you do not want to use git, you can still download the source code as a single archive provided by GitHub.

Once you have downloaded the source code, make sure that Maven is installed on your system. Please check the installation guide of Maven if you do not have it installed.

To compile SpatialHadoop, navigate to the source code and run the command:

mvn compile

This will automatically retrieve all dependencies and compile the source code.

To build a redistribution package, run the command:

mvn assembly:assembly

This Maven command will package all classes of SpatialHadoop along with the dependent jars not included in Hadoop into an archive. This archive can be used to install SpatialHadoop on any existing Hadoop cluster.
