Coder Social home page Coder Social logo

Preserve Global IDs about forklift HOT 11 CLOSED

stdavis avatar stdavis commented on August 24, 2024
Preserve Global IDs

from forklift.

Comments (11)

stdavis avatar stdavis commented on August 24, 2024 2

I think that there is a possibility of making this work for roads at least in the short term. I think that we should add some checking to forklift for global id fields and the required index. Then apply the preserve ids env var for each append individually. This wouldn't fix it for stuff that's updated via the insert cursor, but that definitely doesn't apply to roads or anything else coming out of internal.

I did verify that global ids are preserved when forklift pushes the data to AGOL by verifying that they match between features from the forklift staging FGDBs and the corresponding hosted feature layer in AGOL.

I'll work on a PR to make this happen.

from forklift.

stdavis avatar stdavis commented on August 24, 2024 1

Could we calculate a field with those values as a string or something to preserve them for now?

I think that could be a good fallback but I think that I've come up with a solution. I'm about to submit a PR.

from forklift.

gregbunce avatar gregbunce commented on August 24, 2024

Yes, I do use it when pulling data from Internal into fgdb and it works as advertised - globalids are preserved..

thanks for getting this going. i think preserving these IDs is a huge step in getting us to the next step with data collaboration.

from forklift.

stdavis avatar stdavis commented on August 24, 2024

Looks like we may run into issues if the source dataset doesn't have the appropriate index on the global id field. Here's my test:

import arcpy

dest = r'W:\forklift\samples\data\sampledestination.gdb\Counties'
source = r'W:\forklift\samples\data\internal.sde\SGID.BOUNDARIES.Counties'

arcpy.env.preserveGlobalIds = True

print('truncating')
arcpy.TruncateTable_management(dest)

print('appending')
arcpy.Append_management(source, dest, 'NO_TEST')

which threw this error:

truncating
appending
Traceback (most recent call last):
  File "forklift\test.py", line 12, in <module>
    arcpy.Append_management(source, dest, 'NO_TEST')
  File "C:\Program Files\ArcGIS\Pro\Resources\ArcPy\arcpy\management.py", line 6153, in Append
    raise e
  File "C:\Program Files\ArcGIS\Pro\Resources\ArcPy\arcpy\management.py", line 6150, in Append
    retval = convertArcObjectToPythonObject(gp.Append_management(*gp_fixargs((inputs, target, schema_type, field_mapping, subtype, expression), True)))
  File "C:\Program Files\ArcGIS\Pro\Resources\ArcPy\arcpy\geoprocessing\_base.py", line 512, in <lambda>
    return lambda *args: val(*gp_fixargs(args, True))
arcgisscripting.ExecuteError: ERROR 003340: The target dataset must have a GlobalID field with a unique index in order to use the Preserve GlobalIDs geoprocessing environment setting.
Failed to execute (Append).

Maybe a try/catch that searches for the error number and retries the append without the env variable set?

I also realized that forklift only uses truncate table when it's wired up to change detection. For everything else, it uses update and insert cursors so I ran this test:

import arcpy

dest = r'W:\forklift\samples\data\sampledestination.gdb\Counties'
source = r'W:\forklift\samples\data\internal.sde\SGID.BOUNDARIES.Counties'

arcpy.env.preserveGlobalIds = True

print('truncating')
arcpy.TruncateTable_management(dest)

print('insert cursor')
fields = [f.name for f in arcpy.ListFields(source) if not f.name.startswith('Shape')] + ['Shape@']
source_rows = arcpy.da.SearchCursor(source, fields)
with arcpy.da.InsertCursor(dest, fields) as cursor:
  for row in source_rows:
    cursor.insertRow(row)

source_rows.reset()
print(next(source_rows))
print(next(arcpy.da.SearchCursor(dest, fields)))

which didn't throw any errors, but the global ids were not preserved:

truncating
insert cursor
(12, '03', 2010031010.0, 2010.0, 'CACHE', 5.0, 'North', 112656, 128289, '{AD3015BE-B3C9-4316-B8DC-03AFBB56B443}', '49005', 2, <Polygon object at 0x294afcc1908[0x294a738fb70]>)
(1, '03', 2010031010.0, 2010.0, 'CACHE', 5.0, 'North', 112656, 128289, '{87545C8A-D34D-43E1-A336-7285E4D0B194}', '49005', 2, <Polygon object at 0x294afd61648[0x294af944f90]>)

from forklift.

gregbunce avatar gregbunce commented on August 24, 2024

that makes sense. speed bump. i wonder if we iterate through all the layers in Internal where GlobalID has no unique index and then assign a unique index to that field.

also, should we ensuring all layers in Internal have a GlobalID?

from forklift.

steveoh avatar steveoh commented on August 24, 2024

If you did that the truncate and load update methods would keep creating larger and larger ids. I think some of our work flows would need to change.

from forklift.

gregbunce avatar gregbunce commented on August 24, 2024

creating larger objectids?

from forklift.

steveoh avatar steveoh commented on August 24, 2024

creating larger objectids?

Global ids. Which kind of makes them useless right?

from forklift.

gregbunce avatar gregbunce commented on August 24, 2024

yeah, i see your point: if the data is updated in Internal via truncate and load then the globalids continually get recreated, making it only useful within the current update cycle.

hmm... trying to tie this idea of consistent globalids across datasets (Internal, AGOL/OpenData, fgdb downloads, etc) into your chalkdust remake as well as promoting more collaboration/feedback with our users. ...using a stable id to keep us all on the same page as to what feature we're all talking about.

for the roads data, the globalid works b/c I'll be preserving the globalids in Internal based on what's in UTRANS, where they are generated. this will allow us to have a consistent and stable globalid across Utrans, Internal, and the NG911 datasets. This globalid is used in the NG911 error reporting and it helps tie the feedback/fixes back to the source utrans data. It would be nice if the roads globalid was consistent in AGOL/OpenData as well. again, thinking about feedback and collaboration in the AGOL/OpenData world... users can recommend changes/fixes based and we can tie it back to the utrans segment based on the globalid.

longwinded, I know. but, your point is taken. aside from the roads data, the current update process of truncate and load kind of throws a monkey wrench into this for most other datasets.

Given, it would still be nice to preserve the ids (keep them consistent between Internal and AGOL/OpenData) for the roads dataset as those will remain stable.

to make that happen, i guess that means we'd have to ensure that all existing data in Internal that has a GlobalID field has a unique index on it, right? that would prevent the preserve ids environmental variable from erroring, right (assuming the roads data is updated in AGOL/OpenData via the append method @stdavis described above - and not the insert cursor method)?

from forklift.

stdavis avatar stdavis commented on August 24, 2024

We are running into this same issue in the WRI forklift instance. Global IDs are not being preserved in the Range Trends data (SiteInfo). This is a problem because they are used to link to the rand trends site (e.g. https://dwrapps.utah.gov/rangetrend/rtstart?SiteID=%7B3E045EEF-FB32-47FD-8BCD-8BB4501F58E6%7D).

This instance does not use change detection.

I wonder if the fix is to apply the same logic in change detection:

if (crate.source_describe['hasGlobalID']):
log.info(f'deleting and copying {crate.destination}')
with arcpy.EnvManager(
geographicTransformations=crate.geographic_transformation,
preserveGlobalIds=True,
outputCoordinateSystem=crate.destination_coordinate_system
):
arcpy.management.Delete(crate.destination)
#: the only way to preserve global id values when exporting to fgdb is to use these tools
if crate.is_table():
arcpy.conversion.TableToTable(crate.source, crate.destination_workspace, crate.destination_name)
else:
arcpy.conversion.FeatureClassToFeatureClass(crate.source, crate.destination_workspace, crate.destination_name)

to the core forklift update:

if changes.has_changes():

from forklift.

steveoh avatar steveoh commented on August 24, 2024

Could we calculate a field with those values as a string or something to preserve them for now?

from forklift.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.