Comments (2)

srinify commented on July 26, 2024

Hi there @ankurpuri1981

During automatic metadata detection, the SDV makes a best-guess effort at column types and table relationships, and then provides convenience methods for updating the metadata so you can tweak and customize it. We've found this approach to be the best way to balance reducing friction (through best-guess automatic detection) with giving users transparency and control over their metadata, which helps ensure higher-quality synthetic data.

The sdtype is set to Unknown when the SDV can't cleanly assign a better sdtype, and these fields are treated as PII (personally identifiable information) fields.

  • If the field contains freeform text (like a vehicle's description), there isn't a clear matching sdtype because the SDV doesn't have an sdtype dedicated to arbitrary text.
  • In other cases, the text field might represent a domain-specific concept like social security numbers, IP addresses, street addresses, or license plate numbers. In that case, I'd recommend updating the field to the relevant sdtype. The SDV docs include a list of supported PII sdtypes.
  • In other cases, the field may actually be numerical but contain some extraneous characters, so it can't be cleanly cast to a number. I'd recommend cleaning your data and then manually setting this column's sdtype!
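
For the last case, a minimal cleaning sketch might look like the one below. It uses only the Python standard library; the helper name `to_number` and the sample values are made up for illustration, and the rule for what counts as an "extraneous character" is an assumption you'd want to adapt to your own data.

```python
import re

# Hypothetical rule: keep digits, sign characters, and the decimal point; drop the rest.
_NON_NUMERIC = re.compile(r"[^0-9.+-]")


def to_number(raw):
    """Return raw as a float, or None when nothing numeric survives cleaning."""
    if raw is None:
        return None
    cleaned = _NON_NUMERIC.sub("", str(raw))
    try:
        return float(cleaned)
    except ValueError:
        # Covers empty strings and leftovers such as "1-2" that aren't valid numbers.
        return None


# Made-up sample values standing in for a column that was detected as Unknown.
raw_values = ["$1,200", "35 kg", "n/a", "-4.5%"]
cleaned_values = [to_number(value) for value in raw_values]
```

If your table is a pandas DataFrame, the same helper could be applied with something like `df["mileage"] = df["mileage"].map(to_number)` (the column name here is hypothetical) before you set the column's sdtype to numerical.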

It looks like you've already found the metadata updating methods, but I'm linking them here as well so you have them handy: https://docs.sdv.dev/sdv/multi-table-data/data-preparation/multi-table-metadata-api#update-api
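
As a rough sketch of how those update methods can be used after automatic detection, the snippet below applies a set of sdtype overrides through `update_column`. The table names, column names, and the `apply_sdtype_overrides` and `demo` helpers are hypothetical, and it assumes an SDV 1.x release where `MultiTableMetadata` is available (newer releases may expose a different metadata class, so check the docs for your version).

```python
def apply_sdtype_overrides(metadata, overrides):
    """Call metadata.update_column once per (table, column) override."""
    for (table_name, column_name), column_kwargs in overrides.items():
        metadata.update_column(
            table_name=table_name,
            column_name=column_name,
            **column_kwargs,
        )


# Hypothetical tables and columns; replace these with your own.
OVERRIDES = {
    ("customers", "ssn"): {"sdtype": "ssn", "pii": True},
    ("vehicles", "mileage"): {"sdtype": "numerical"},
}


def demo(data):
    """Detect metadata, then apply the overrides.

    data is assumed to be a dict mapping table name -> pandas DataFrame.
    Requires the sdv package to be installed.
    """
    from sdv.metadata import MultiTableMetadata

    metadata = MultiTableMetadata()
    metadata.detect_from_dataframes(data)  # best-guess detection
    apply_sdtype_overrides(metadata, OVERRIDES)  # manual corrections
    metadata.validate()  # raises if the result is inconsistent
    return metadata
```

Keeping the overrides in one dictionary makes it easy to rerun detection on fresh data and reapply the same corrections.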

Out of curiosity, where does the source data that you're trying to feed into the SDV live? A database? An API endpoint? Flat files in a file store?


srinify commented on July 26, 2024

Hi there @ankurpuri1981, I hope my response was useful! I haven't heard from you in 2 weeks, so I'm going to go ahead and close this issue.

Feel free to open a new issue if you have more questions!
