This project is based on django-react-boilerplate, which provides ready-made configurations and setups that speed up Django+React development.
- Django: Backend framework
- React: Frontend library
- Webpack: Bundler for static assets
- Material-UI: Provides useful UI components, including tables
- Mongo: Stores dataframe process metadata, offering flexibility in schema (Note: Ensure long-term schema consistency)
- Minio: Object storage compatible with Amazon S3, used for storing dataframe files
- Make: Simplifies setup with useful command aliases
- Docker: Allows easy deployment via docker-compose
- Navigate to the project root directory.
- Run `make docker_setup`.
- Run `make docker_up`.
- Access the MINIO web page at localhost:9001.
- Using a Mongo client of your choice, connect to mongodb://localhost:27017, then create a new Mongo database named `dataframe_cleaner` with a collection named `dataframe_metadata`.
- Access localhost:8000 to use Data Cleaner.
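
The Mongo step above can also be scripted instead of done by hand in a client. The following is a minimal sketch, not part of the project: it assumes `pymongo` is installed and that the Mongo container accepts unauthenticated connections on localhost:27017. The helper name `ensure_collection` is hypothetical.

```python
# Sketch only: assumes pymongo is installed and that the Mongo instance
# started by `make docker_up` is reachable without credentials.
MONGO_URL = "mongodb://localhost:27017"
DB_NAME = "dataframe_cleaner"
COLLECTION_NAME = "dataframe_metadata"


def ensure_collection(client, db_name=DB_NAME, collection_name=COLLECTION_NAME):
    """Create the collection if it is missing; return True if it was created."""
    db = client[db_name]
    if collection_name in db.list_collection_names():
        return False
    # MongoDB creates databases lazily, so creating the collection is
    # also what makes the database itself appear.
    db.create_collection(collection_name)
    return True
```

To run it against the local instance, pass a real client: `from pymongo import MongoClient`, then `ensure_collection(MongoClient(MONGO_URL))`. Calling it a second time returns `False` and leaves the collection unchanged.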
The following suggestions would make the project more production-ready; some were deferred due to time constraints:
- Parameterize constants like category thresholds for user customization.
- Implement version rollback and processing from specific versions.
- Enable pagination for the dataframe API to facilitate viewing large datasets.
- Implement user authentication and authorization.
- Provide detailed explanations in the UI about decision processes for user clarity.
- The current approach fetches progress status by polling every second. This is not ideal, as it creates unnecessary connections to the backend. Long polling, Server-Sent Events (SSE), or WebSockets would be better.
- Implement robust transaction handling across data sources.
- Modularize the dataframe processing app for better backend isolation.
- Set up MINIO retention policies for efficient disk space management.
- Segregate servers for user requests and data processing to facilitate load scaling.
- Consider email notifications for users upon completion of data processing for better user experience with large datasets.
- Transition to TypeScript for improved type consistency.
- Integrate Redux for better state management.
- Add proper styling to the frontend.
- Develop Postman test scripts.
- Conduct integration tests for both backend and frontend.
- Perform load testing on dataframe processing.
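
As an illustration of the progress-status item above, here is a minimal sketch of what an SSE-based replacement for the one-second polling could look like on the backend. It is an assumption-laden example rather than the project's implementation: `get_progress` is a hypothetical callable standing in for however the backend looks up processing progress, and the `progress` event name and `percent` field are made up for the example.

```python
import json
import time
from typing import Callable, Iterator


def format_sse(data: dict, event: str = "progress") -> str:
    """Serialize one Server-Sent Events message (a blank line ends the message)."""
    return f"event: {event}\ndata: {json.dumps(data)}\n\n"


def progress_stream(
    get_progress: Callable[[], int], interval: float = 1.0
) -> Iterator[str]:
    """Yield an SSE message each time progress changes, stopping at 100."""
    last = None
    while True:
        current = get_progress()  # hypothetical lookup, e.g. from Mongo metadata
        if current != last:
            yield format_sse({"percent": current})
            last = current
        if current >= 100:
            return
        time.sleep(interval)
```

In Django, a generator like this could be returned from a view with `StreamingHttpResponse(progress_stream(...), content_type="text/event-stream")`, and the React side could subscribe with the browser's `EventSource`. The server still checks progress on an interval, but the client holds a single open connection and only receives data when the value changes.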