Comments (8)
Note that this is stepping into over-engineering territory. As of right now, there is no real duplication to warrant this extra abstraction but as we implement new tasks we will find if this is worth
from lightning-flash.
A better solution is to do:
class DataPipeline:
    def __init__(self, collate: CollatePipeline, uncollate: UncollatePipeline):
        self.collate = collate
        self.uncollate = uncollate

    def before_collate(self, ...):
        self.collate.before_collate(...)

    ...
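A runnable version of the composition idea above might look like the sketch below. The bodies of `CollatePipeline` and `UncollatePipeline`, the `LabelUncollate` class, and the method names `collate_fn` and `uncollate_fn` are assumptions made for illustration; they are not the actual lightning-flash implementation.

```python
from typing import Any, List, Sequence


class CollatePipeline:
    """Hypothetical collate half: raw samples -> batch."""

    def before_collate(self, samples: Sequence[Any]) -> List[Any]:
        # Per-sample processing; the default just copies the samples.
        return list(samples)

    def collate(self, samples: Sequence[Any]) -> List[Any]:
        # Batching; a plain list stands in for a stacked tensor here.
        return list(samples)


class UncollatePipeline:
    """Hypothetical uncollate half: batch -> individual outputs."""

    def uncollate(self, batch: Sequence[Any]) -> List[Any]:
        return list(batch)


class LabelUncollate(UncollatePipeline):
    """Hypothetical task-specific uncollate: maps class indices to names."""

    def __init__(self, names: Sequence[str]):
        self.names = list(names)

    def uncollate(self, batch: Sequence[int]) -> List[str]:
        return [self.names[i] for i in batch]


class DataPipeline:
    """Delegates to the two halves instead of inheriting from them."""

    def __init__(self, collate: CollatePipeline, uncollate: UncollatePipeline):
        self.collate = collate
        self.uncollate = uncollate

    def before_collate(self, samples: Sequence[Any]) -> List[Any]:
        return self.collate.before_collate(samples)

    def collate_fn(self, samples: Sequence[Any]) -> List[Any]:
        return self.collate.collate(self.before_collate(samples))

    def uncollate_fn(self, batch: Sequence[Any]) -> List[Any]:
        return self.uncollate.uncollate(batch)


# Two tasks share one collate object but use different uncollate logic.
shared = CollatePipeline()
plain = DataPipeline(shared, UncollatePipeline())
labelled = DataPipeline(shared, LabelUncollate(["cat", "dog"]))
```

With this layout, swapping the uncollate behaviour means passing a different object, and no method resolution order is involved.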
from lightning-flash.
collate_fn is really coming from the dataset to process new raw_data
To be precise, before_collate is the one to process new raw_data. Then you have collate to handle batching and after_collate for any batch processing.
So the first one does have some degree of conflict with the dataset, but the second two do not.
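The three stages described above can be sketched with plain functions. The concrete work done in each stage (parsing strings, list batching, mean-centering) is made up for illustration, and the composing function `collate_fn` is an assumed name.

```python
from typing import List, Sequence


def before_collate(raw_samples: Sequence[str]) -> List[float]:
    # Per-sample step: turn each raw item into a usable value.
    # This is the stage that overlaps with what a dataset might already do.
    return [float(x) for x in raw_samples]


def collate(samples: Sequence[float]) -> List[float]:
    # Batching step: a list stands in for a stacked tensor.
    return list(samples)


def after_collate(batch: Sequence[float]) -> List[float]:
    # Batch-level step: needs the whole batch, so it cannot live in the dataset.
    mean = sum(batch) / len(batch)
    return [x - mean for x in batch]


def collate_fn(raw_samples: Sequence[str]) -> List[float]:
    # The function a dataloader would call: the three stages in order.
    return after_collate(collate(before_collate(raw_samples)))
```

Only `before_collate` touches individual raw samples; the other two operate on the batch as a whole.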
About the proposal
I generally like the idea. I can see us having to duplicate collate logic between tasks with different uncollate logic.
But doing it with mixins might grow to be confusing. See this example:
class A:
    def a(self):
        print('a')

class B:
    def b(self):
        print('b')

class C(A, B):
    ...

x = C()
x.a()  # a
x.b()  # b
# great!

class D:
    def a(self):
        print('d')

class C(D, B): ...

x = C()
x.a()  # d
x.b()  # b
# great!

class E:
    def a(self):
        print('e')

# If they subclass C, now order matters
class F(E, C):
    ...

class G(C, E):
    ...

x = F()
x.a()  # e
x.b()  # b

x = G()
x.a()  # d eek!
x.b()  # b
so yeah... if this grows it will become a nightmare to follow
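One way to see why the order matters is to print the method resolution order of each class. The sketch below redefines the classes from the example (returning strings instead of printing) and adds a small helper, `mro_names`, which is a name introduced here for illustration.

```python
class B:
    def b(self):
        return "b"


class D:
    def a(self):
        return "d"


class C(D, B):
    pass


class E:
    def a(self):
        return "e"


class F(E, C):
    pass


class G(C, E):
    pass


def mro_names(cls):
    # Python looks attributes up on these classes from left to right.
    return [k.__name__ for k in cls.__mro__]


print(mro_names(F))  # ['F', 'E', 'C', 'D', 'B', 'object']
print(mro_names(G))  # ['G', 'C', 'D', 'B', 'E', 'object']
```

In `G`, `D` comes before `E`, so `D.a` wins and `E.a` is never reached, which matches the surprising `d` above.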
from lightning-flash.
Yes, and let's create the default for each data-type and data-type task.
from lightning-flash.
So people are just left to implement uncollate_fn
from lightning-flash.
Users should be able to modify the preprocessing step (preferably on the GPU) after the dataloading/batching and before the model execution.
There should be an overridable "batch preprocessing" function defined in the DataPipeline that is called unconditionally before the model, whether for training or inference, or maybe split into separate training and inference hooks.
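A minimal sketch of such a hook is shown below. The method name `batch_preprocess`, the `training` flag, the `ScaledPipeline` subclass, and the `run_model` helper are all hypothetical; device placement is left out, so the batch is a plain list rather than a tensor on the GPU.

```python
from typing import Callable, List


class DataPipeline:
    """Hypothetical pipeline with an overridable batch-level hook."""

    def batch_preprocess(self, batch: List[float], training: bool) -> List[float]:
        # Default: pass the batch through unchanged.
        return batch


class ScaledPipeline(DataPipeline):
    """User override: scale 8-bit pixel values into the range [0, 1]."""

    def batch_preprocess(self, batch: List[float], training: bool) -> List[float]:
        # `training` is available if train and inference should differ.
        return [x / 255.0 for x in batch]


def run_model(
    pipeline: DataPipeline,
    model: Callable[[List[float]], float],
    batch: List[float],
    training: bool,
) -> float:
    # The hook runs unconditionally, after batching and before the model.
    batch = pipeline.batch_preprocess(batch, training)
    return model(batch)
```

Here `run_model(ScaledPipeline(), sum, [0, 255], training=False)` passes `[0.0, 1.0]` to the model, while the base `DataPipeline` would pass the batch through untouched.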
from lightning-flash.
Adding @carmocca and @kaushikb11 as reviewers!
from lightning-flash.
@tchaton DataPipeline was already merged. Can we close this?
from lightning-flash.