Comments (3)
Can you give details about the "Graal" plans?
WorkProcessor provides a transformation method, WorkProcessor#transform.
Suppose you have a chain of Page transformations, e.g.:
WorkProcessor<Page> processor1 = ...;
WorkProcessor<Page> processor2 = processor1.transform(transformation1);
WorkProcessor<Page> processor3 = processor2.transform(transformation2);
...
Such a chain of Page transformations can be compiled into a tight loop that doesn't materialize intermediate results. See the paper http://www.vldb.org/pvldb/vol4/p539-neumann.pdf and the project https://hyper-db.de/.
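To make the difference concrete, here is a rough sketch that uses a plain long[] as a stand-in for a single-channel Page. All names in it (FusionSketch, materialized, fused) are hypothetical and not Trino APIs. The fused variant still dispatches through LongUnaryOperator for each step, whereas generated code would inline the steps into the loop body; the point is only that no intermediate arrays are allocated.

```java
import java.util.Arrays;
import java.util.function.LongUnaryOperator;

public class FusionSketch {
    // Materializing approach: every transformation allocates a full intermediate array.
    public static long[] materialized(long[] input, LongUnaryOperator... steps) {
        long[] current = input;
        for (LongUnaryOperator step : steps) {
            long[] next = new long[current.length];
            for (int i = 0; i < current.length; i++) {
                next[i] = step.applyAsLong(current[i]);
            }
            current = next; // intermediate result is materialized here
        }
        return current;
    }

    // Fused approach: a single pass over the rows; each row flows through all steps
    // before the next row is touched, so only the output array is allocated.
    public static long[] fused(long[] input, LongUnaryOperator... steps) {
        long[] output = new long[input.length];
        for (int i = 0; i < input.length; i++) {
            long value = input[i];
            for (LongUnaryOperator step : steps) {
                value = step.applyAsLong(value);
            }
            output[i] = value;
        }
        return output;
    }

    public static void main(String[] args) {
        long[] input = {1, 2, 3};
        LongUnaryOperator plusOne = x -> x + 1;
        LongUnaryOperator timesTwo = x -> x * 2;
        // Both variants compute (x + 1) * 2 for every row.
        System.out.println(Arrays.toString(materialized(input, plusOne, timesTwo)));
        System.out.println(Arrays.toString(fused(input, plusOne, timesTwo)));
    }
}
```

Both calls produce the same rows; only the number of intermediate allocations differs.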
To generate such a tight loop, one can extend WorkProcessor#transform so that it generates optimized bytecode (using the existing airlift bytecode framework), e.g.:
static WorkProcessor<Page> transform(
        WorkProcessor<Page> processor,
        Transformation<Page, Page> transformation)
{
    ...
    if (transformation instanceof BytecodeRowTransformation) {
        // generate tight loop
    }
    else {
        // proceed with intermediate pages materialization
    }
}
interface BytecodeRowTransformation extends Transformation<Page, Page> {
    BytecodeExpression generateTransformation(BytecodeTransformationContext context);
}
interface BytecodeTransformationContext {
    ..
    // transformation result bytecode
    BytecodeExpression needsMoreData();
    BytecodeExpression producedResult();
    ..
    // input row channel getters (bytecode)
    BytecodeExpression getChannel(int channel);
    BytecodeExpression isNull(int channel);
    ..
    // output row channel setters (bytecode)
    void defineChannel(int channel, Supplier<BytecodeExpression> definition);
    void defineIsNull(int channel, Supplier<BytecodeExpression> definition);
    ..
}
BytecodeRowTransformation#generateTransformation would generate the bytecode of the transformation, using BytecodeTransformationContext to consume input and produce output within the generated code.
However, generating bytecode is cumbersome and error-prone. Truffle/Graal provides a nice abstraction for creating highly performant interpreters, which we could also use to write maintainable and readable WorkProcessor transformations (tutorial on using Truffle: http://cesquivias.github.io/blog/2014/12/02/writing-a-language-in-truffle-part-2-using-truffle-and-graal/). In that case we wouldn't use BytecodeExpression, but much friendlier classes and annotations mixed with normal type-safe Java code, e.g.:
interface TruffleRowTransformation extends Transformation<Page, Page> {
    TruffleNode generateTransformation(TruffleTransformationContext context);
}
interface TruffleTransformationContext {
    ..
    // similar methods as in BytecodeTransformationContext, but using Truffle node classes
}
Some notes:
- WorkProcessor transformations are functional, so one could actually create a language interpreter for them, e.g.:
transform(
    transform(
        processor,
        context -> python transformation),
    context -> java transformation)
- The Truffle/Graal and WorkProcessor abstractions enable us to use other languages for transformations (e.g. Python). For instance, we could implement table functions that are written in non-Java languages but are JITed into a tight loop together with Java code.
This is just a draft; I still need to experiment more with Truffle/Graal to work out the details.
from trino.
We could also make it more type-friendly, e.g.:
interface TruffleRowTransformation {
    TruffleNode generateTransformation(TruffleTransformationContext context);
}
// lazily compiles the stacked Truffle transformations when any `WorkProcessor`
// interface method is called.
interface TruffleWorkProcessor extends WorkProcessor<Page> {
    TruffleWorkProcessor transform(TruffleRowTransformation transformation);
}
interface TruffleWorkProcessorFactory {
    // if `workProcessor` is already a TruffleWorkProcessor then return it,
    // so compilation can be extended further
    TruffleWorkProcessor toTruffleWorkProcessor(WorkProcessor<Page> workProcessor);
}
Then usage is:
WorkProcessor<Page> result = workProcessor
        .transformProcessor(truffleWorkProcessorFactory::toTruffleWorkProcessor)
        .transform(truffleTransformation1)
        .transform(truffleTransformation2)
        // this automatically compiles the previously stacked Truffle transformations
        .transform(nonTruffleTransformation);
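A minimal sketch of the "stack now, compile on first use" idea, assuming a long[] stand-in for Page and LongUnaryOperator stand-ins for Truffle transformations. StackedProcessor and compileCount are hypothetical names, and "compilation" here is just function composition rather than real Truffle/Graal compilation.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.LongUnaryOperator;

public class LazyStackSketch {
    // Hypothetical stand-in for TruffleWorkProcessor: collects row-level steps and
    // "compiles" (here: composes) them only when a result is first requested.
    public static final class StackedProcessor {
        private final long[] input;
        private final List<LongUnaryOperator> steps;
        private LongUnaryOperator compiled; // null until the first getResult() call
        public int compileCount;            // exposed only to make the laziness visible

        public StackedProcessor(long[] input, List<LongUnaryOperator> steps) {
            this.input = input.clone();
            this.steps = List.copyOf(steps);
        }

        // Stacking returns a new processor and performs no compilation.
        public StackedProcessor transform(LongUnaryOperator step) {
            List<LongUnaryOperator> extended = new ArrayList<>(steps);
            extended.add(step);
            return new StackedProcessor(input, extended);
        }

        // The first call composes all stacked steps; later calls reuse the composition.
        public long[] getResult() {
            if (compiled == null) {
                LongUnaryOperator chain = LongUnaryOperator.identity();
                for (LongUnaryOperator step : steps) {
                    chain = chain.andThen(step);
                }
                compiled = chain;
                compileCount++;
            }
            long[] output = new long[input.length];
            for (int i = 0; i < input.length; i++) {
                output[i] = compiled.applyAsLong(input[i]);
            }
            return output;
        }
    }

    public static void main(String[] args) {
        StackedProcessor processor = new StackedProcessor(new long[] {1, 2, 3}, List.of())
                .transform(x -> x + 1)
                .transform(x -> x * 2);
        System.out.println(processor.compileCount); // nothing compiled while stacking
        System.out.println(java.util.Arrays.toString(processor.getResult()));
    }
}
```

Because each transform call returns a new processor without compiling, further transformations can keep extending the chain until something actually pulls a result.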
Question: how should compilations be cached? Should TruffleNodes be comparable?
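One possible direction, sketched under the assumption that a transformation chain can be described by value-comparable descriptors: key the cache on the descriptor list, so structurally equal chains share one compiled artifact. Step, CompilationCacheSketch and compilationCount are hypothetical names, and whether real Truffle nodes can be given this kind of equality is exactly the open question.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.LongUnaryOperator;

public class CompilationCacheSketch {
    // Hypothetical descriptor of one step; records get structural equals/hashCode.
    public record Step(String op, long constant) {}

    private final Map<List<Step>, LongUnaryOperator> cache = new ConcurrentHashMap<>();
    private final AtomicInteger compilations = new AtomicInteger();

    // Returns the cached artifact for a structurally equal chain, compiling at most once.
    public LongUnaryOperator compile(List<Step> chain) {
        return cache.computeIfAbsent(List.copyOf(chain), this::doCompile);
    }

    public int compilationCount() {
        return compilations.get();
    }

    // Stand-in for real compilation: composes the steps into a single function.
    private LongUnaryOperator doCompile(List<Step> chain) {
        compilations.incrementAndGet();
        LongUnaryOperator result = LongUnaryOperator.identity();
        for (Step step : chain) {
            long constant = step.constant();
            LongUnaryOperator next;
            switch (step.op()) {
                case "add":
                    next = x -> x + constant;
                    break;
                case "multiply":
                    next = x -> x * constant;
                    break;
                default:
                    throw new IllegalArgumentException("unknown op: " + step.op());
            }
            result = result.andThen(next);
        }
        return result;
    }

    public static void main(String[] args) {
        CompilationCacheSketch cache = new CompilationCacheSketch();
        LongUnaryOperator first = cache.compile(List.of(new Step("add", 1), new Step("multiply", 2)));
        LongUnaryOperator second = cache.compile(List.of(new Step("add", 1), new Step("multiply", 2)));
        System.out.println(first == second);          // same cached instance
        System.out.println(cache.compilationCount()); // compiled once
    }
}
```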
Thank you @sopel39 for leading this. This is tremendous!