
arvkevi / clinvar-kaggle


Scripts used to generate the ClinVar conflicting classifications dataset on Kaggle

Home Page: https://www.kaggle.com/kevinarvai/clinvar-conflicting

License: MIT License

Python 100.00%
machine-learning kaggle-dataset bioinformatics genomics kaggle

clinvar-kaggle's Introduction

Hello, 👋 my name is Kevin Arvai. I'm a data scientist with 10+ years of experience in the genomics field.

Connect 🤝
Since you're here, let me know you stopped by. Share your Python or data science story with me on Twitter or LinkedIn. I love hearing about what people are working on in the open-source community!

Favorite project 🧬
I wrote an app that predicts users' ancestry from their genetic data.

Non-GitHub stuff 💻
I like machine learning, open-source software/data, and genomics.
My Real Python articles, blog posts, and Kaggle profile.

clinvar-kaggle's People

Contributors

arvkevi


clinvar-kaggle's Issues

Data Description

Could you share a description of the features? Most of the column names are abbreviations, and it is not clear what they refer to. If there is a link to a data description, please share it.

License File Missing

Hello Kevin,

My name is Ryan Ulaszek and I work at Amazon as a Solution Architect in Life Sciences. Would you be willing to add a license file to the repo? It looks like the Kaggle project is under Creative Commons, but this repo doesn't have the Creative Commons license file. We think the work you have done is a great example of machine learning in genomics and would like to build an AWS Solution based on it. We will of course point back to your GitHub repo and Kaggle site to give you credit. We would like to use your dataset and the process_clinvar.py file.

Thanks,

Ryan

Vawk and Gawk dependencies

Hello Kevin,

Is there an easy way to break the dependency on Vawk and Gawk? We can easily add zips of Python packages to Glue jobs, but binaries are a challenge, and Gawk is a binary. Can you think of an easy way to rework the process_clinvar.py script with only Python dependencies?

Thanks,

Ryan
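
One possible direction for the question above, offered only as a rough sketch: the kind of field extraction that awk-style tools perform on a VCF file can be done with the Python standard library alone. This is not the code in process_clinvar.py, and it does not reproduce whatever that script actually does with Vawk. The function names (parse_info, iter_vcf_records), the chosen INFO keys, and the sample record are all hypothetical. The only thing assumed about the input is the standard VCF layout: tab-separated lines with eight fixed columns, the last of which is a semicolon-separated INFO field.

```python
import io


def parse_info(info):
    """Turn a VCF INFO string such as 'A=1;B=x;FLAG' into a dict."""
    fields = {}
    if info == ".":  # '.' marks an empty INFO column
        return fields
    for item in info.split(";"):
        if not item:
            continue
        key, sep, value = item.partition("=")
        # Entries without '=' are flags; record them as True.
        fields[key] = value if sep else True
    return fields


def iter_vcf_records(handle, info_keys):
    """Yield one dict per data line: fixed columns plus the requested INFO keys."""
    for line in handle:
        if line.startswith("#"):  # skip meta-information and the column header
            continue
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 8:  # not a complete VCF data line
            continue
        chrom, pos, _id, ref, alt, _qual, _filter, info = cols[:8]
        parsed = parse_info(info)
        record = {"CHROM": chrom, "POS": int(pos), "REF": ref, "ALT": alt}
        for key in info_keys:
            record[key] = parsed.get(key)  # None when the key is absent
        yield record


# Made-up record for illustration only; not taken from ClinVar.
example_vcf = io.StringIO(
    "##fileformat=VCFv4.1\n"
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n"
    "1\t12345\t.\tG\tA\t.\t.\tCLNSIG=Benign;GENEINFO=GENE1:1;EXAMPLE_FLAG\n"
)
records = list(iter_vcf_records(example_vcf, ["CLNSIG", "GENEINFO", "MC"]))
print(records)
```

For a real file, the same generator could be fed from open(path) or gzip.open(path, "rt"), both of which are in the standard library, so nothing beyond a plain Python runtime would need to be packaged.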
