esri / geoprocessing-tools-for-hadoop
The Hadoop GP Toolbox provides tools to exchange features between a Geodatabase and Hadoop and run Hadoop workflow jobs.
License: Apache License 2.0
In shapefiles with text attributes that include the percent sign (%), JSONUtil.py fails for both the Enclosed and Unenclosed Features to JSON options.
I'd submit a pull request if only I were better at Python.
Here are two examples and the resulting errors for both enclosed and unenclosed JSON:
secure.firstenergycorp.com/servlet/com.firstenergycorp.webobjects.Engine;jsessionid=NILS2LQAABJWHLA1AAP1LPI?s=com.firstenergycorp.www.Home&o=43304514&q=3&p=%2FContact+Us
Enclosed output:
Executing: FeaturesToJSON iou_terr D:\GIS_DATA\platts\enc_float.json ENCLOSED_JSON FORMATTED
Start Time: Fri Mar 06 11:39:38 2015
Running script FeaturesToJSON...
Traceback (most recent call last):
File "<string>", line 363, in execute
File "D:\Code\geoprocessing-tools-for-hadoop\JSONUtil.py", line 368, in ConvertFC2JSON
geometry_str = unicode(row[len(row) - 1]) if pjson != True else unicode(json.dumps(json.loads(row[len(row) - 1]), indent=4))
TypeError: a float is required
Failed to execute (FeaturesToJSON).
Failed at Fri Mar 06 11:39:39 2015 (Elapsed Time: 0.97 seconds)
Unenclosed output:
Executing: FeaturesToJSON iou_terr D:\GIS_DATA\platts\iou_unenc_some_bad.json UNENCLOSED_JSON FORMATTED
Start Time: Fri Mar 06 11:24:03 2015
Running script FeaturesToJSON...
Traceback (most recent call last):
File "<string>", line 365, in execute
File "D:\Code\geoprocessing-tools-for-hadoop\JSONUtil.py", line 422, in ConvertFC2JSONUnenclosed
attributes_json.clear()
TypeError: a float is required
Failed to execute (FeaturesToJSON).
Failed at Fri Mar 06 11:24:04 2015 (Elapsed Time: 0.93 seconds)
"www.heco.com/CDA/default/0,1999,TCID%253D8%2526CCID%253D0%2526LCID%253D0%2526CTYP%253DARTC,00.html"
Enclosed output:
Executing: FeaturesToJSON iou_terr D:\GIS_DATA\platts\enc_format_char.json ENCLOSED_JSON FORMATTED
Start Time: Fri Mar 06 11:40:48 2015
Running script FeaturesToJSON...
Traceback (most recent call last):
File "<string>", line 363, in execute
File "D:\Code\geoprocessing-tools-for-hadoop\JSONUtil.py", line 368, in ConvertFC2JSON
geometry_str = unicode(row[len(row) - 1]) if pjson != True else unicode(json.dumps(json.loads(row[len(row) - 1]), indent=4))
ValueError: unsupported format character 'D' (0x44) at index 261
Failed to execute (FeaturesToJSON).
Failed at Fri Mar 06 11:40:52 2015 (Elapsed Time: 3.92 seconds)
Unenclosed output:
Executing: FeaturesToJSON iou_terr D:\GIS_DATA\platts\iou_unenc_some_bad.json UNENCLOSED_JSON FORMATTED
Start Time: Fri Mar 06 11:25:06 2015
Running script FeaturesToJSON...
Traceback (most recent call last):
File "<string>", line 365, in execute
File "D:\Code\geoprocessing-tools-for-hadoop\JSONUtil.py", line 422, in ConvertFC2JSONUnenclosed
attributes_json.clear()
ValueError: unsupported format character 'D' (0x44) at index 261
Failed to execute (FeaturesToJSON).
Failed at Fri Mar 06 11:25:09 2015 (Elapsed Time: 3.88 seconds)
I'm just going to null out those attribute values.
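Both error messages above are the ones Python raises when text containing a bare percent sign is used as a `%`-format template. The sketch below reproduces them and shows the usual workaround of doubling the percent signs. It assumes, without having confirmed it in JSONUtil.py, that the attribute text ends up inside a `%`-formatted string somewhere. `escape_percent` and `interpolate` are hypothetical helpers, not functions from the toolbox.

```python
def escape_percent(text):
    """Double every percent sign so %-formatting treats it as a literal."""
    return text.replace("%", "%%")


def interpolate(attribute_text, feature_id):
    # Hypothetical stand-in for any code that builds a string with the
    # % operator from text that includes an attribute value.
    template = attribute_text + " (feature %s)"
    return template % (feature_id,)


# Fragments taken from the two URLs in the report.
samples = ["p=%2FContact+Us", "TCID%253D8"]

for raw in samples:
    try:
        interpolate(raw, 7)
    except (TypeError, ValueError) as err:
        # "%2F" is parsed as a float conversion, "%253D" as an unknown one.
        print("raw text fails:", type(err).__name__)
    # Escaped text passes through unchanged.
    print(interpolate(escape_percent(raw), 7))
```

If the assumption holds, escaping attribute text this way (or not using attribute text as a format template at all) would avoid having to null out the values.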
As a user, I want to clone this one repository, and have everything needed to add the toolbox to my ArcGIS system.
Hi all,
I'm having trouble importing a JSON table from HDFS into ArcMap 10.3 (English language package).
The table is successfully generated from the sample earthquake.csv file by:
CREATE TABLE agg_samp(point binary)
ROW FORMAT SERDE 'com.esri.hadoop.hive.serde.JsonSerde'
STORED AS INPUTFORMAT 'com.esri.json.hadoop.UnenclosedJsonInputFormat'
OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat';
INSERT OVERWRITE TABLE agg_samp SELECT ST_Point(latitude, longitude) FROM earthquakes_new;
When importing it by the set up:
I already checked Issue #22 but couldn't find a solution.
Can you, please, help me?
Thanks in advance.
Referencing Esri/gis-tools-for-hadoop#20
Idea: complementary to #5, or in the interim, evaluate Oozie Workflow Generator Tool for mention in the documentation of the Execute Workflow GP tool.
Background:
The documentation for Execute Workflow mentions the need for workflow XML but does not contain, nor link to, information on how to get or make the workflow XML.
The tutorials cover the other four tools but not Execute Workflow.
The trip-discovery blog article mentions Execute Workflow only in passing, opting instead for command-line invocation.
There are examples with the point-in-polygon and trip-discovery source code - is this easy enough to find?
A job-execution tool based on HCatalog instead of Oozie is proposed for easier usage. Effort might better be invested there, unless there is something that can be done substantially faster with documentation of the Oozie-based Execute Workflow tool.
Hello,
We are experimenting with Hadoop and ArcGIS. We have downloaded the Apache Hadoop software (1.2.1). We can install it on either a CentOS or Ubuntu Linux box. Which will work better for us (classroom examples of ArcGIS/Hadoop), and which version of the OS? Thank you.
Does "JSON to Features" only support Point-Features?
I always get the error "sequence size must match size of the row" when trying to convert a polygon.
I created the JSON with the "Features to JSON" tool from a very simple polygon feature class.
The JSON file is attached (just rename the .txt to .json).
I am exporting features to JSON.
I want the output in decimal degrees, but it is in meters.
How can I do this?
Thank you
1. Download geoprocessing-tools-for-hadoop-master.zip and unzip it.
2. Add HadoopTools.pyt to the toolbox.
All of the tools then show an error and cannot be used.
The Python error:
Traceback (most recent call last):
File "", line 205, in getParameterInfo
File "d:\arcgis\arcgisinfo10.2\desktop10.2\arcpy\arcpy\arcobjects\mixins.py", line 286, in init
setattr(self, attrib, attribvalue)
File "d:\arcgis\arcgisinfo10.2\desktop10.2\arcpy\arcpy\arcobjects_base.py", line 89, in _set
return setattr(self._arc_object, attr_name, cval(val))
ValueError: ParameterObject: DataType \u5c5e\u6027\u7684\u8f93\u5165\u503c\u65e0\u6548
Following along in Spark at https://github.com/geoHeil/spatial-heatmaps/tree/master/esri
the JSON SerDe is not found:
ClassNotFoundException: Class com.esri.hadoop.hive.serde.JsonSerde not found
even though:
"com.esri.hadoop" % "spatial-sdk-hive" % esriVersion,
"com.esri.hadoop" % "spatial-sdk-json" % esriVersion,
i.e. the current master branch (2.1.0-SNAPSHOT), are on the class path. Am I missing a dependency? The base JSON SerDe would be available but is not called.
The issue here probably makes more sense than at Esri/gis-tools-for-hadoop#65
ArcMap 10.2.2, English
Windows 7
Error when adding the Hadoop toolbox
User's md5sum matches the working version on my computer (e68151f010f1c4908f18eabe91200e25 *HadoopTools.pyt)
Issue first reported on GeoNet
Does the tool "copy from HDFS" communicate only via the namenode port, which is usually 50070?
Or can it use other ports like from datanodes or zookeeper?
Additional question: If the customer is not sure which port his namenode (HDFS TCP port number) is configured on, how can he find out which port to use?
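One way to look up the port, sketched under the assumption that the tool talks to the namenode's HTTP endpoint (as the usual default of 50070 suggests), is to read the address from the cluster's hdfs-site.xml. The property names `dfs.namenode.http-address` (Hadoop 2) and `dfs.http.address` (Hadoop 1) are standard Hadoop settings. The sample XML and the `find_namenode_http_address` helper are hypothetical, and a cluster that relies on the built-in default may not list the property at all.

```python
import xml.etree.ElementTree as ET

# Hypothetical excerpt of an hdfs-site.xml file; a real one lives in the
# Hadoop configuration directory of the cluster.
SAMPLE_HDFS_SITE = """<configuration>
  <property>
    <name>dfs.namenode.http-address</name>
    <value>namenode.example.com:50070</value>
  </property>
</configuration>"""

ADDRESS_PROPERTIES = ("dfs.namenode.http-address", "dfs.http.address")


def find_namenode_http_address(hdfs_site_xml):
    """Return (host, port) of the namenode HTTP endpoint, or None if unset."""
    root = ET.fromstring(hdfs_site_xml)
    for prop in root.findall("property"):
        if prop.findtext("name") in ADDRESS_PROPERTIES:
            host, _, port = prop.findtext("value").strip().rpartition(":")
            return host, int(port)
    return None


print(find_namenode_http_address(SAMPLE_HDFS_SITE))
```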
It is a common occurrence that users encounter issues with network configuration, such as etc/hosts or a firewall, when trying to use the Geoprocessing Tools for Hadoop. An obscure error message such as "getaddrinfo failed" leads to the first impression that there are software bugs in the GP Tools. We should present more informative and helpful error messages that instruct the user to investigate network configuration, rather than giving up or filing a bug report before checking the network. Cross-reference the multiple GIS-Tools-for-Hadoop issues that arose out of network issues:
Esri/gis-tools-for-hadoop#22
Esri/gis-tools-for-hadoop#16
Esri/gis-tools-for-hadoop#14
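A minimal sketch of the kind of message proposed above, assuming the tools could check name resolution before connecting. `describe_connection_problem` is a hypothetical helper, not part of the toolbox. The `resolver` parameter exists only so the check can be exercised without a network.

```python
import socket


def describe_connection_problem(host, port, resolver=socket.getaddrinfo):
    """Return a user-facing hint if the host cannot be resolved, else None."""
    try:
        resolver(host, port)
    except socket.gaierror as err:
        # Turn the bare "getaddrinfo failed" into advice the user can act on.
        return (
            "Could not resolve host '%s' (%s). Check the hostname, DNS "
            "settings, and the hosts file, then check the firewall, before "
            "reporting a bug." % (host, err)
        )
    return None


def failing_resolver(host, port):
    # Simulates the failure seen on a misconfigured machine.
    raise socket.gaierror(11004, "getaddrinfo failed")


print(describe_connection_problem("sandbox.example", 50070, failing_resolver))
```

A message like this keeps the original error text for diagnosis while pointing the user at the network configuration first.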
Hi,
I created a model and ran it as a tool. From the results, I tried to publish it to my server as a geoprocessing service so that I can consume it from Web AppBuilder, but it failed.
I have 10.3.1 server, 10.3.1 desktop and Hortonworks 2.3.2 VM running locally.
Is this GP service capability supported?
A customer wants to use mainly Knox to communicate with his Hadoop system.
As far as I can see, these GP tools cannot use Knox.
Am I right?
Are there plans to implement this?
Hi,
I am having a problem with NULL values in a date column after querying in Hive.
Procedure
Our aim is to transfer feature classes from a geodatabase to HDFS using the Esri Hadoop toolbox. The steps: we create a JSON file using Features to JSON, then create a table in Hive using a create statement. Then we copy the file to HDFS and query in Hive to retrieve the field values.
Problems I am facing
I am having an issue with the date fields, as shown in the screenshots. When creating the table, I tried date as the data type in one create statement and string as the data type in another.
With the former, after copying to HDFS, queries retrieve no values and fail with: Failed with exception java.io.IOException:java.lang.ClassCastException: org.apache.hadoop.hive.serde2.io.DateWritable cannot be cast to org.apache.hadoop.io.Text.
With the latter, where I used string as the data type, it shows some random values as expected, and when I tried to cast the date column to date I got NULL values.
I have attached the two screenshots. Kindly help me with this issue, as I cannot proceed further in migrating the feature classes to Hadoop.
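One possible reading of the "random values", offered as an assumption since the screenshots are not available here: Esri JSON usually stores dates as milliseconds since the Unix epoch, and such a number cannot be cast directly to a date. The Python sketch below only illustrates the arithmetic needed to turn such a value into a calendar date. `epoch_millis_to_iso_date` is a hypothetical helper and the sample value is made up.

```python
from datetime import datetime, timezone


def epoch_millis_to_iso_date(value):
    """Convert milliseconds since 1970-01-01 UTC to a YYYY-MM-DD string."""
    seconds = int(value) / 1000
    return datetime.fromtimestamp(seconds, tz=timezone.utc).strftime("%Y-%m-%d")


# Made-up sample value, accepted as a number or as the string a
# string-typed column would return.
print(epoch_millis_to_iso_date(1425600000000))
print(epoch_millis_to_iso_date("1425600000000"))
```

If the column does hold epoch milliseconds, the same division by 1000 would be needed on the Hive side before converting the value to a date.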