Comments (19)
How many positive and negative pairs have you labeled? The logs say "All labels are the same value and fitIntercept=true, so the coefficients will be zeros. Training is not needed." which means that there are possibly no positive pairs?
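A rough way to check for this situation before training is to tally the labels you have given so far. The sketch below is not part of Zingg; `summarize_labels` is a hypothetical helper, and it assumes you can get your labels as a plain list that uses the codes from the labelling prompt (0 = no match, 1 = match, 2 = not sure). It also assumes "not sure" pairs do not count towards either class.

```python
from collections import Counter

# Label codes as shown in the labelling prompt in this thread.
NO_MATCH, MATCH, NOT_SURE = 0, 1, 2


def summarize_labels(labels):
    """Count each label and report whether both classes are present.

    Hypothetical helper, not a Zingg API. A binary classifier needs at
    least one positive and one negative example; if every label has the
    same value there is nothing to learn, which is what the log message
    above is reporting.
    """
    counts = Counter(labels)
    positives = counts.get(MATCH, 0)
    negatives = counts.get(NO_MATCH, 0)
    return {
        "positives": positives,
        "negatives": negatives,
        "not_sure": counts.get(NOT_SURE, 0),
        # Assumption: "not sure" pairs are ignored for this check.
        "trainable": positives > 0 and negatives > 0,
    }


if __name__ == "__main__":
    # Made-up labels for illustration: no pair was marked as a match.
    print(summarize_labels([0, 0, 2, 0, 0]))
```

If `trainable` comes back as `False`, the labelled set contains only one class, which would be consistent with the "All labels are the same value" message.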
from zingg.
+-------+---------+------+-----------+---------------------+--------------+--------+--------------+--------+-------------+---------+------------+----------+----------+-------------+-----------+---+------+--------+--------+----+----+-----------+------------+-----------+--------+
|fname |TALUKNAME|GPNAME|VILLAGENAME|SCHEMENAME |SERIESYEARNAME|BENFCODE|SPOUSENAME |BENFNAME|HUSBAND |CASTENAME|RELIGIONNAME|GENDERNAME|OCCUPATION|TOTALFAMILYNO|ADDRESS |AGE|AADHAR|VERIFIED|PHONENUM|VC |VI |VRATIONCARD|NEWACCOUNTNO|GOLDENRECNO|z_source|
+-------+---------+------+-----------+---------------------+--------------+--------+--------------+--------+-------------+---------+------------+----------+----------+-------------+-----------+---+------+--------+--------+----+----+-----------+------------+-----------+--------+
|RAJAMMA|Mysuru |Varuna|Chikkahalli|Basava Housing Scheme|2015 2016 |265342 |CHIKKAMUTTAIAH|rajam ma|cikkamuttayya|SC |Hindu |Female |Labour |3 |CHIKKAHALLI|36 |null |No |null |null|null|null |85034963709 |5910841 |test |
|RAJAMMA|Mysuru |Varuna|Chikkahalli|Basava Housing Scheme|2015 2016 |265342 |CHIKKAMUTTAIAH|rajam ma|cikkamuttayya|SC |Hindu |Female |Labour |3 |CHIKKAHALLI|36 |null |No |null |null|null|null |85034963709 |5910841 |test |
+-------+---------+------+-----------+---------------------+--------------+--------+--------------+--------+-------------+---------+------------+----------+----------+-------------+-----------+---+------+--------+--------+----+----+-----------+------------+-----------+--------+
Please select from the following choices
No, they do not match : 0
Yes, they match : 1
Not sure : 2
To exit : 9
Please enter your choice [0,1,2 or 9]: 1
Record pair 11 out of 22 records to be labelled by the user.
Zingg predicts the records ARE NOT KNOWN IF MATCH with a similarity score of 0.00
from zingg.
Sorry, I did not understand your last comment.
from zingg.
The pair above is a positive pair, but the log message says there are possibly no positive pairs.
Also, the trained records are not being produced in the tmp folder.
from zingg.
How many positive pairs do you have? You need at least 20-30 pairs for the classifier to train on.
from zingg.
Record pair 21 out of 22 records to be labelled by the user.
Zingg predicts the records ARE NOT KNOWN IF MATCH with a similarity score of 0.00
What is the score? And what does it mean for a record pair when I choose "No, they do not match" or "Yes, they match"?
from zingg.
Sorry, I do not understand the question above. In the initial round, Zingg does not give any score to the pairs. Once you label a few pairs, Zingg learns similarities and starts refining the matching predictions and scores. Does that help?
from zingg.
Can you please refer to the output below?
Records not matching:
Please select from the following choices
No, they do not match : 0
Yes, they match : 1
Not sure : 2
To exit : 9
Please enter your choice [0,1,2 or 9]: 0
Record pair 17 out of 20 records to be labelled by the user.
Zingg predicts the records MATCH with a similarity **score of 1.00**
Records matching:
Please select from the following choices
No, they do not match : 0
Yes, they match : 1
Not sure : 2
To exit : 9
Please enter your choice [0,1,2 or 9]: 1
Record pair 18 out of 20 records to be labelled by the user.
Zingg predicts the records DO NOT MATCH with a similarity **score of 0.27**
Please select from the following choices
No, they do not match : 0
Yes, they match : 1
Not sure : 2
To exit : 9
Please enter your choice [0,1,2 or 9]: 1
Record pair 19 out of 20 records to be labelled by the user.
Zingg predicts the records MATCH with a similarity **score of 1.00**
I've shown two scenarios here: in the first one the records don't match, and in the second and third ones the records do match. But the scores shown are inconsistent with that.
Do you understand?
from zingg.
The labelling phase scores are indicative and get refined as you add more samples. How many pairs have you labelled so far? How many positives?
from zingg.
Out of 21, 15 are positives.
Does positive mean the same record?
from zingg.
Yes, positive means they are the same. How many total rounds have you run? Have you labelled 15 pairs as yes out of 21 in total, or only in this round?
from zingg.
I labelled 15 pairs as yes out of 21 in total.
from zingg.
OK, finding so many matches in the very first round of labelling is pretty good. Do a few more rounds until you have about 30-49 matches, and then try trainMatch.
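As a quick sanity check between rounds, the rule of thumb above can be written down as a small function. This is only an illustrative sketch: `next_step` and its `min_matches` parameter are hypothetical names, not part of Zingg, and the default of 30 is simply the lower end of the range suggested in this comment.

```python
def next_step(labels, min_matches=30):
    """Suggest whether to keep labelling or move on to trainMatch.

    Hypothetical helper, not a Zingg API. `labels` uses the codes from
    the labelling prompt (0 = no match, 1 = match, 2 = not sure).
    """
    matches = sum(1 for label in labels if label == 1)
    if matches >= min_matches:
        return "trainMatch"
    return f"label more pairs ({min_matches - matches} more matches needed)"


if __name__ == "__main__":
    # 15 matches out of 21 labelled pairs, as reported in this thread.
    # How the remaining 6 pairs were labelled is an assumption here.
    labels = [1] * 15 + [0] * 6
    print(next_step(labels))  # label more pairs (15 more matches needed)
```

The threshold is a parameter because the thread itself quotes more than one target for the number of matches.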
from zingg.
Yes, will do that..
from zingg.
What is the maximum length of a header name?
I've used a header of this length:
MEMBER_OCCUPATION_ID_DESCRIPTION
from zingg.
Is there a problem?
from zingg.
It didn't accept the underscore ("_").
from zingg.
Oh, looks like a bug. I opened #50 for that. Thanks for reporting. Please add any other observations related to that issue there.
from zingg.
Is the training issue sorted, @premsmac2021? If so, can you please close this issue or let me know?
from zingg.