Comments (8)

AkshitaB commented on May 23, 2024

@Alexixu Possibly your use case can be solved by handling empty instances in the (custom) DatasetReader you're using. If not, please share more details on which dataset reader you are running this with.

from allennlp.

Alexixu commented on May 23, 2024

@AkshitaB The dataset reader is a custom class that inherits from DatasetReader. Empty instances are fine if the data loader can handle them. Discarding empty instances is the most direct way to do that, but the default DataLoader implementation has no such logic. In my view, throwing an exception is more suitable for a corrupt example, with the DataLoader catching that exception and discarding the example.

dirkgr commented on May 23, 2024

@Alexixu, you can do this in the DatasetReader if you override how _read() works. You can return something like None from DatasetReader.text_to_instance(), and then do the right thing in _read().

Alexixu commented on May 23, 2024

@dirkgr I have tried this, but the default DataLoader implementation cannot handle a None object, and it throws an exception along the lines of "None type has no index function".

I suggest handling this in an explicit way, by defining a concrete exception and adding try/catch logic to the DataLoader implementation.

dirkgr commented on May 23, 2024

What I'm saying is, you can change this behavior in your own DatasetReader, where you override the _read() method to throw away the None objects.

github-actions commented on May 23, 2024

This issue is being closed due to lack of activity. If you think it still needs to be addressed, please comment on this thread 👇

Alexixu commented on May 23, 2024

@dirkgr I have tried exactly that, by implementing a _read function that returns a None object. But the data loader (not the dataset reader), which calls the text_to_instance function, cannot handle a None object.

dirkgr commented on May 23, 2024

The _read() function should not return None. The _read() function is where you detect None and throw it away (instead of returning it).

Think of it this way: From _read() you have to return an iterable of instances. AllenNLP does not care how you do this. It only cares that _read() returns an iterable of instances. So you can do whatever you want inside of _read(), including skipping instances.
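For anyone landing here later, the pattern described above might look roughly like the sketch below. It is deliberately standalone and does not import allennlp: `SkippingReader` is a hypothetical stand-in for a custom DatasetReader subclass, plain dicts stand in for instances, and `_read()` takes an iterable of lines instead of a file path so the example runs on its own. The only point is the control flow: `text_to_instance()` may return None, and `_read()` filters those out before yielding.

```python
from typing import Dict, Iterable, Iterator, Optional


class SkippingReader:
    """Hypothetical stand-in for a custom DatasetReader subclass."""

    def text_to_instance(self, text: str) -> Optional[Dict[str, str]]:
        # Return None for examples that cannot become a usable instance.
        # (A real reader would build and return an instance object here.)
        stripped = text.strip()
        if not stripped:
            return None
        return {"text": stripped}

    def _read(self, lines: Iterable[str]) -> Iterator[Dict[str, str]]:
        # _read() must produce an iterable of instances, so the None
        # values are dropped here and never yielded to the caller.
        for line in lines:
            instance = self.text_to_instance(line)
            if instance is None:
                continue
            yield instance


reader = SkippingReader()
instances = list(reader._read(["first example", "   ", "", "second example"]))
```

With this shape, whatever consumes the output of `_read()` only ever sees real instances, so no None handling is needed downstream.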
