Comments (15)

sheinbergon commented on June 4, 2024

@blikij These GIS extensions are merely top-level extensions on top of Dremio. ST_Intersects, for example, just deserializes two Arrow buffers to JTS Geometry instances and performs an operation on them.

What I'm getting at is that it's very hard for these functions to crash a Dremio node/cluster on their own. grpc_message:"ExecutionSetupException: One or more nodes lost connectivity during query" might indicate an OOM error for the node, meaning you need to fine-tune the heap settings.

Note that if you truly have a 2GB polygon (what you set the field size limit to), running multiple GIS functions in the same pass will put a huge burden on the JVM heap, as you'll be deserializing the geometry back and forth multiple times. This is not related to the GIS extensions themselves, but rather to the way query planners work and to the size of a single datum.

Having said that, I will download the sample datasets and try to recreate the issue myself

A few questions:

  • Which version of Dremio are you using?
  • Which version of this library are you using?
  • What JVM heap/GC settings are you using?
  • What's the deployment setting? Single Node? Cluster? On-Prem? AWS?

from dremio-udf-gis.

sheinbergon commented on June 4, 2024

@blikij Note that you've uploaded the same files twice under different names (Cdda.zip and Nuts.zip both contain the Nuts parquet files), so I cannot recreate the issue. Please provide the correct files.

blikij commented on June 4, 2024

Hmm, odd. I must have been tired :-) I've added a link to my OneDrive because the file is larger than the 25MB limit (it is 835MB).
https://eea1-my.sharepoint.com/:u:/g/personal/jan_bliki_eea_europa_eu/EdyUusPxDgtAmHk_QLNZUhEBQpyYkVqwpP9wiiTCnbYXcA?e=HMYHAg

No, we don't have 2GB polygons in this example. But we certainly have some that exceed 64KB.

We use the following Dremio version:

  • Build: 23.1.0-202211250136090978-a79618c7
  • Edition: Community Edition
  • Build Time: 11/25/2022 01:47:03
  • Change Hash: a79618c7d0b7fbee5b56899176e3a396830dfeff
  • Change Time: 11/24/2022 21:16:47

I think we use the latest version of the library, but I would need to ask my system administrator. I don't know the JVM heap and GC settings, but I will ask on Monday.

We run on-prem with one coordinator, one master coordinator and three executors. All test machines (virtual servers) have 8 CPUs, 8GB of RAM and 20GB of disk allocated.

sheinbergon commented on June 4, 2024

> Hmm odd. I must have been tired :-) I add a link to my one drive because it is bigger than 25MB (835MB). https://eea1-my.sharepoint.com/:u:/g/personal/jan_bliki_eea_europa_eu/EdyUusPxDgtAmHk_QLNZUhEBQpyYkVqwpP9wiiTCnbYXcA?e=HMYHAg

I can't access that link (it requires a Microsoft account). Please provide a fully public URL.

blikij commented on June 4, 2024

sheinbergon commented on June 4, 2024

Nope. I don't have a Microsoft account.

blikij commented on June 4, 2024

sheinbergon commented on June 4, 2024

Working now, thanks. I'll check locally and get back to you.

blikij commented on June 4, 2024

The settings on both workers and coordinators are:
uintx ErgoHeapSizeLimit                         = 0                                   {product}    
uintx HeapSizePerGCThread                       = 87241520                            {product}    
uintx InitialHeapSize                          := 526385152                           {product}    
uintx LargePageHeapSizeThreshold                = 134217728                           {product}    
uintx MaxHeapSize                              := 8392802304                          {product}

openjdk version "1.8.0_352"

Dremio version: 23.1.0

sheinbergon commented on June 4, 2024

Thanks, I will get back to you in a day or so.

sheinbergon commented on June 4, 2024

@blikij I ran the query locally (several times). It works as expected (I got it to run in 3:45 minutes).

Before we dive into optimization tips and ideas and cover more general concepts, I want to be sure I understand what you're aiming to do here:

SELECT 
    cdda.siteName,
    cdda.cddaRegionCode,
    nuts.NUTS_ID
FROM "Local"."files"."Cdda" as cdda
LEFT JOIN "Local"."files"."Nuts" as nuts
ON LENGTH(nuts.NUTS_ID) >4 AND ST_Intersects(cdda.__bbox,nuts.__bbox)
WHERE ST_Intersects(nuts.geo_value,cdda.geo_value)

So you are filtering on geometry intersection and joining on bounding-box intersection... What is the real use case here? It sounds like we should be doing one intersection test instead of two, correct? (Just JOIN on geo_value.)

The reason I want to verify your intent is that the first place to optimize is the query itself.

blikij commented on June 4, 2024

sheinbergon commented on June 4, 2024

Got it, that indeed makes sense, and I can confirm the query behaves accordingly

So, let's move forward to diagnose the problem.
Looking at the spec you provided, I think you are indeed encountering an OOM from the executor nodes themselves.

You mentioned each machine has 8GB of RAM, 8 vCPUs and 20GB of storage. However, you are also sizing your max heap to 8GB (8392802304 bytes). Given that Dremio utilizes a lot of native (direct) memory out of the box, and that you are probably running the default G1GC, you can easily reach OOM.

The GIS extensions cannot crash the Dremio app, as they are mere UDF functions. However they do cause it to utilize more heap memory, as my implementation relies on heap memory for intersection processing.
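
To put the heap sizing in numbers, here is a minimal shell sketch that converts the reported MaxHeapSize to MiB and shows what share of physical RAM it would claim. The function names are made up for illustration, and the 8192 MiB RAM figure is an assumption based on the 8GB spec mentioned in this thread.

```shell
#!/bin/sh
# Convert a byte count (as printed in the JVM flag listing) to whole MiB.
bytes_to_mib() {
    echo $(( $1 / 1024 / 1024 ))
}

# Percentage of physical RAM that a given max heap would claim.
# Both arguments are in MiB; the result is truncated to an integer.
heap_share_pct() {
    echo $(( $1 * 100 / $2 ))
}

heap_mib=$(bytes_to_mib 8392802304)   # MaxHeapSize reported earlier in the thread
ram_mib=8192                          # assumption: 8GB of RAM per node
echo "max heap: ${heap_mib} MiB"                                   # max heap: 8004 MiB
echo "share of RAM: $(heap_share_pct "$heap_mib" "$ram_mib")%"     # share of RAM: 97%
```

A heap allowed to grow to roughly 97% of RAM leaves almost nothing for direct memory and the OS, which is consistent with nodes dropping out under load.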

Here are my recommendations, and how I currently run Dremio engines successfully:

  1. Move Dremio to JDK 11 (you are using JDK 8, and 11 is fully supported by Dremio). This requires changing JAVA_HOME to point to a JDK 11 installation. I recommend Azul or Corretto.
  2. Move to ShenandoahGC. Dremio uses G1 by default, which performs poorly by modern standards.
  3. Use the compact heuristics mode; this prevents the JVM heap from inflating as garbage accumulates.
  4. Set ConcGCThreads to 4. This query never utilizes more than 2 cores (on a single machine), so the rest of the cores can be used to make GC cycles run more swiftly. You can try running with this value set to 2 and see if you encounter notable differences.
  5. Size the max heap properly. Max heap should be no more than 4GB, given that you have only 8GB of RAM available.

Using these settings, I got this query to run in ~3:30 minutes (which is great, given that, as you're aware, there are no indices in Dremio and you're effectively doing 269M intersection matches). Observing the process RSS while it runs, it never crosses 4GB, so you should be fine with your given memory spec, as long as you don't oversize the heap.
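
If you want to watch RSS yourself while the query runs, something along these lines should work on Linux. This is only a sketch: the helper names are hypothetical, it relies on `ps -o rss=` reporting kilobytes (as procps does), and you would need to substitute the PID of your own Dremio daemon for the demo PID.

```shell
#!/bin/sh
# Print the resident set size (RSS) of a process in whole MiB.
# Falls back to 0 if ps returns nothing for the given PID.
rss_mib() {
    rss_kib=$(ps -o rss= -p "$1" 2>/dev/null | tr -d ' ')
    echo $(( ${rss_kib:-0} / 1024 ))
}

# Sample a PID's RSS a fixed number of times, one second apart.
watch_rss() {
    pid=$1
    samples=$2
    i=0
    while [ "$i" -lt "$samples" ]; do
        echo "rss_mib=$(rss_mib "$pid")"
        i=$(( i + 1 ))
        sleep 1
    done
}

# Demo on the current shell; for Dremio, pass the daemon's PID instead.
watch_rss $$ 2
```

If the reported value keeps climbing toward the machine's physical RAM during the join, the heap is most likely oversized for the node.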

I'm embedding my dremio-env so you can see how I tuned it:

#
# Copyright (C) 2017-2019 Dremio Corporation
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#

#
# Dremio environment variables used by Dremio daemon
#

#
# Directory where Dremio logs are written
# Default to $DREMIO_HOME/log
#
#DREMIO_LOG_DIR=${DREMIO_HOME}/log

#
# Send logs to console and not to log files. The DREMIO_LOG_DIR is ignored if set.
#
#DREMIO_LOG_TO_CONSOLE=1

#
# Directory where Dremio pidfiles are written
# Default to $DREMIO_HOME/run
#
#DREMIO_PID_DIR=${DREMIO_HOME}/run

#
# Max total memory size (in MB) for the Dremio process
#
# If not set, default to using max heap and max direct.
#
# If both max heap and max direct are set, this is not used
# If one is set, the other is calculated as difference
# of max memory and the one that is set.
#
#DREMIO_MAX_MEMORY_SIZE_MB=

#
# Max heap memory size (in MB) for the Dremio process
#
# Default to 4096 for server
#
DREMIO_MAX_HEAP_MEMORY_SIZE_MB=4096

#
# Max direct memory size (in MB) for the Dremio process
#
# Default to 8192 for server
#
DREMIO_MAX_DIRECT_MEMORY_SIZE_MB=8192

#
# Max permanent generation memory size (in MB) for the Dremio process
# (Only used for Java 7)
#
# Default to 512 for server
#
#DREMIO_MAX_PERMGEN_MEMORY_SIZE_MB=512

#
# Garbage collection logging is enabled by default. Set the following
# parameter to "no" to disable garbage collection logging.
#
#DREMIO_GC_LOGS_ENABLED="yes"

#
# Send GC logs to console and not to log files. The DREMIO_LOG_DIR is ignored if set.
# Default is set to "no"
#
#DREMIO_GC_LOG_TO_CONSOLE="no"

#
# By default G1GC is used as java garbage collection.
# This can be overriden by changing this parameter
#
DREMIO_GC_OPTS="-XX:+UseShenandoahGC"

#
# Java version will be checked by default.
# Currently only java 8 is supported by dremio.
# This check can be disabled by changing value to false.
#
#DREMIO_JAVA_VERSION_CHECK="true"

#
# The scheduling priority for the server
#
# Default to 0
#
# DREMIO_NICENESS=0
#

#
# Number of seconds after which the server is killed forcibly it it hasn't stopped
#
# Default to 120
#
#DREMIO_STOP_TIMEOUT=120

# Extra Java options - shared between dremio and dremio-admin commands
#
DREMIO_JAVA_EXTRA_OPTS="-Xms4096m -XX:ShenandoahGCHeuristics=compact -XX:ConcGCThreads=4"

# Extra Java options - client only (dremio-admin command)
#
#DREMIO_JAVA_CLIENT_EXTRA_OPTS=

# Extra Java options - server only (dremio command)
#
#DREMIO_JAVA_SERVER_EXTRA_OPTS=
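
Since most of that file is comments, a quick way to see which settings are actually in effect is to filter out comment and blank lines. The sketch below is illustrative only: `active_settings` is a made-up helper, and the demo runs against a small stand-in file rather than a real dremio-env, whose location depends on your installation.

```shell
#!/bin/sh
# Print only the active (uncommented, non-empty) lines of an env file.
active_settings() {
    grep -Ev '^[[:space:]]*(#|$)' "$1"
}

# Demo against a small stand-in file that mimics part of dremio-env.
demo=$(mktemp)
cat > "$demo" <<'EOF'
#
# Max heap memory size (in MB) for the Dremio process
#
DREMIO_MAX_HEAP_MEMORY_SIZE_MB=4096

#DREMIO_MAX_MEMORY_SIZE_MB=
DREMIO_GC_OPTS="-XX:+UseShenandoahGC"
EOF
active_settings "$demo"
rm -f "$demo"
```

Run against the file above, this should leave just the four lines I changed: the max heap, the max direct memory, DREMIO_GC_OPTS and DREMIO_JAVA_EXTRA_OPTS.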

sheinbergon commented on June 4, 2024

@blikij did you have a chance to test these settings?

sheinbergon commented on June 4, 2024

Closing, as this is not a bug in the library, but rather an execution configuration issue
