Comments (4)
Let us discuss this @navinrathore
from zingg.
We can have two files, blockingFunctions.txt and similarityFunctions.txt. Each contains multiple functions written as SQL.
In the blocking file, each entry has the name of the function followed by the input column type, the output column type, and the SQL. The FROM clause uses the same name as the name of the data in the config.
NameFn: string, string
Select name from test;
Other functions are defined similarly.
Similarity functions always return double but
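As a rough illustration of the file layout described above, here is a minimal parsing sketch. It assumes each entry is a header line of the form `Name: inputType, outputType` followed by SQL that ends with a semicolon; the class and field names (`BlockingFileParserSketch`, `FunctionDef`) are hypothetical and not part of Zingg.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: parses entries like
//   NameFn: string, string
//   Select name from test;
class BlockingFileParserSketch {

    /** One parsed entry: function name, input type, output type and SQL text. */
    static final class FunctionDef {
        public final String name;
        public final String inputType;
        public final String outputType;
        public final String sql;

        FunctionDef(String name, String inputType, String outputType, String sql) {
            this.name = name;
            this.inputType = inputType;
            this.outputType = outputType;
            this.sql = sql;
        }
    }

    static List<FunctionDef> parse(String text) {
        List<FunctionDef> defs = new ArrayList<>();
        String name = null;
        String inputType = null;
        String outputType = null;
        StringBuilder sql = new StringBuilder();

        for (String raw : text.split("\\R")) {
            String line = raw.trim();
            if (line.isEmpty()) {
                continue; // blank lines between entries are ignored
            }
            if (name == null) {
                // Header line (assumed format): "Name: inputType, outputType"
                int colon = line.indexOf(':');
                if (colon < 0) {
                    throw new IllegalArgumentException("Expected a header line, got: " + line);
                }
                String[] types = line.substring(colon + 1).split(",");
                if (types.length != 2) {
                    throw new IllegalArgumentException("Expected two types in: " + line);
                }
                name = line.substring(0, colon).trim();
                inputType = types[0].trim();
                outputType = types[1].trim();
            } else {
                // SQL body: collect lines until one ends with a semicolon
                if (sql.length() > 0) {
                    sql.append(' ');
                }
                sql.append(line);
                if (line.endsWith(";")) {
                    defs.add(new FunctionDef(name, inputType, outputType, sql.toString()));
                    name = null;
                    sql.setLength(0);
                }
            }
        }
        if (name != null) {
            throw new IllegalArgumentException("Unterminated SQL for function " + name);
        }
        return defs;
    }

    public static void main(String[] args) {
        String sample = "NameFn: string, string\nSelect name from test;\n";
        for (FunctionDef d : parse(sample)) {
            System.out.println(d.name + " (" + d.inputType + " -> " + d.outputType + "): " + d.sql);
        }
    }
}
```

The same reader could serve the similarity file if it ends up using the same layout.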
Pass the file locations with the flags --blockingFunctions and --similarityFunctions.
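A minimal sketch of reading these two flags, assuming each flag is followed by its file location as the next argument. `FunctionFileFlags` is a hypothetical standalone helper and does not reflect how ClientOptions is actually implemented.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper: collects the file locations passed with the two flags.
class FunctionFileFlags {
    static final String BLOCKING = "--blockingFunctions";
    static final String SIMILARITY = "--similarityFunctions";

    static Map<String, String> parse(String[] args) {
        Map<String, String> locations = new HashMap<>();
        for (int i = 0; i < args.length; i++) {
            if (BLOCKING.equals(args[i]) || SIMILARITY.equals(args[i])) {
                if (i + 1 >= args.length) {
                    throw new IllegalArgumentException("Missing file location after " + args[i]);
                }
                // Assumption: the value is the argument right after the flag
                locations.put(args[i], args[++i]);
            }
        }
        return locations;
    }

    public static void main(String[] args) {
        String[] sample = {"--blockingFunctions", "blockingFunctions.txt",
                           "--similarityFunctions", "similarityFunctions.txt"};
        System.out.println(parse(sample));
    }
}
```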
- all work is to be done in the new branch.
- take the flags --blockingFunctions and --similarityFunctions in ClientOptions, Arguments, and Client
- parse them and set the right values in the above
- define the yml for blocking functions, also take care of linking etc. (Sonal)
- define the blocking function interface (Sonal)
- figure out which library to use for parsing the yml, possibly Jackson
- read the yml and build the blocking functions as per the interface
- register the hash functions in the registry
- build the blocking tree (Sonal)
- test (Sonal/Navin)
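The interface and registry steps in the list above could look roughly like the sketch below. Everything here is an assumption for discussion: the interface shape, the `SqlBlockingFunction` class, and the `Registry` class are hypothetical and do not mirror Zingg's existing hash function registry. Reading the yml (for example with Jackson) is left out so the sketch stays dependency-free.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class BlockingRegistrySketch {

    /** Hypothetical interface for one SQL-backed blocking function. */
    interface BlockingFunction {
        String name();
        String inputType();
        String outputType();
        String sql();
    }

    /** Simple immutable implementation built from a parsed file entry. */
    static final class SqlBlockingFunction implements BlockingFunction {
        private final String name;
        private final String inputType;
        private final String outputType;
        private final String sql;

        SqlBlockingFunction(String name, String inputType, String outputType, String sql) {
            this.name = name;
            this.inputType = inputType;
            this.outputType = outputType;
            this.sql = sql;
        }

        public String name() { return name; }
        public String inputType() { return inputType; }
        public String outputType() { return outputType; }
        public String sql() { return sql; }
    }

    /** Hypothetical registry keyed by function name; rejects duplicate names. */
    static final class Registry {
        private final Map<String, BlockingFunction> byName = new LinkedHashMap<>();

        void register(BlockingFunction fn) {
            if (byName.putIfAbsent(fn.name(), fn) != null) {
                throw new IllegalArgumentException("Duplicate blocking function: " + fn.name());
            }
        }

        /** Returns the function registered under the name, or null if absent. */
        BlockingFunction get(String name) {
            return byName.get(name);
        }

        int size() {
            return byName.size();
        }
    }

    public static void main(String[] args) {
        Registry registry = new Registry();
        registry.register(new SqlBlockingFunction("NameFn", "string", "string", "Select name from test;"));
        System.out.println(registry.get("NameFn").sql());
    }
}
```

Keying the registry by name keeps lookups simple while the blocking tree is built; whether names alone are a sufficient key (versus name plus input type) is an open design choice.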