
Comments (8)

ayujain04 commented on July 24, 2024

Another possibility is: For each (drug, disease) pair of interest, extract the list of meta-paths automatically, as a CSV file?

from hetionet.

dhimmel commented on July 24, 2024

get a sum of the path scores of all metapaths from a source node to a target node

Summing all the path scores is not a metric we have explored. For reference from this manuscript:

The path score equals the proportion of the DWPC contributed by a path multiplied by the magnitude of the DWPC’s p-value (-log10(p)).

Therefore, if you wanted to sum path scores across all metapaths, you don't actually need to know the individual paths. You could sum the -log10(p-value) for each metapath. You could get p-values for metapaths whose significance exceeds the database inclusion threshold via API calls like https://search-api.het.io/v1/metapaths/source/17054/target/6602/ (this is what the webapp uses).
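The reasoning is that, within one metapath, the proportions of the DWPC contributed by its paths sum to 1, so the path scores for that metapath sum to -log10(p). A minimal Python sketch of the summation, assuming the per-metapath records have already been extracted from the API response into a list of dicts that each carry a `p_value` field. The function name and the sample values are made up for illustration, and whether to use `p_value` or `adjusted_p_value` is a choice left to the reader:

```python
import math


def summed_path_score(records, field="p_value"):
    """Sum -log10(p) over the metapath records of one source/target pair."""
    total = 0.0
    for record in records:
        p = record[field]
        if p <= 0:
            # -log10(0) is undefined; skip the record rather than guess a floor.
            continue
        total += -math.log10(p)
    return total


# Illustrative records only: abbreviations and p-values are invented.
example_records = [
    {"metapath_abbreviation": "AAA", "p_value": 0.001},
    {"metapath_abbreviation": "BBB", "p_value": 0.01},
]

total_score = summed_path_score(example_records)
print(total_score)
```

Only metapaths that pass the database inclusion threshold are returned by the API, so a sum computed this way covers those metapaths and no others.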

Are you interested in all metapaths (up to a given length), or are some metapaths more interesting for your application?


ayujain04 commented on July 24, 2024

Okay, that makes sense. Thank you! I am interested in all the metapaths (up to a given length) that are significant enough. Would I be able to use that API call to do that?


dhimmel commented on July 24, 2024

Yes, although you will likely first need the mapping from Neo4j internal identifiers to persistent disease/compound IDs. You can get it by running this Cypher query at https://neo4j.het.io/browser/:

MATCH (node)
WHERE node:Compound OR node:Disease
RETURN id(node) AS id, node.identifier AS identifier, node.name AS name, labels(node)[0] AS type
ORDER BY type, identifier

Then you can use those ids for the API calls above. How many node pairs do you want to do this for? If it's a very large number, you might be better off running the queries against the PostgreSQL database directly at search-db.het.io.
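As a rough sketch of that lookup step in Python: assuming the result of the Cypher query has been saved as CSV with the columns `id`, `identifier`, `name`, and `type` (the aliases used in the query), the persistent identifiers can be mapped to internal ids and substituted into the API URL pattern shown earlier. The helper names are hypothetical and the internal ids in the sample data are made up; the real ones come from the query output.

```python
import csv
import io

# URL pattern taken from the API call quoted earlier in this thread.
API_TEMPLATE = "https://search-api.het.io/v1/metapaths/source/{source}/target/{target}/"


def load_id_lookup(csv_text):
    """Map persistent identifier -> Neo4j internal id from the exported CSV."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return {row["identifier"]: int(row["id"]) for row in reader}


def metapath_url(lookup, source_identifier, target_identifier):
    """Build the metapath API URL for one source/target pair."""
    return API_TEMPLATE.format(
        source=lookup[source_identifier],
        target=lookup[target_identifier],
    )


# Sample export: the ids 101 and 202 are invented placeholders.
sample_export = """id,identifier,name,type
101,DB00331,Metformin,Compound
202,DOID:1612,breast cancer,Disease
"""

lookup = load_id_lookup(sample_export)
url = metapath_url(lookup, "DB00331", "DOID:1612")
print(url)
```

Looping `metapath_url` over every Compound identifier with a fixed Disease identifier would give one URL per drug for a single disease.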


ayujain04 commented on July 24, 2024

Thanks for the response!

For reference: I am trying to build a disease-specific hypergraph.

To do so, I will need each metapath from every drug in the graph to a specific disease node.

It would be helpful to be able to query, say, Metformin and Dementia and then get a CSV file of every metapath from Metformin to Dementia.

Is this possible?


ayujain04 commented on July 24, 2024

For example, the sample API query that you provided returns a list of metapaths from the source node to the target node.

One such metapath is listed below. However, how can I get the ids of each node in each metapath? I will need these to construct the hypergraph, as each of these nodes will be in a single hyperedge between one drug and one disease.

{
            "id": 72430549,
            "adjusted_p_value": 0.045993636486382335,
            "path_count": 126,
            "dwpc": 4.386227813718969,
            "p_value": 0.0003801126982345647,
            "reversed": false,
            "metapath_abbreviation": "CbGdAlD",
            "metapath_name": "Compound–binds–Gene–downregulates–Anatomy–localizes–Disease",
            "metapath_length": 3,
            "metapath_path_count_density": 0.590437,
            "metapath_path_count_mean": 4.04788,
            "metapath_path_count_max": 372,
            "metapath_dwpc_raw_mean": 0.000121205,
            "metapath_n_similar": 121,
            "metapath_p_threshold": 1.0,
            "metapath_id": "CbGdAlD",
            "metapath_reversed": false,
            "metapath_metaedges": [
                [
                    "Compound",
                    "Gene",
                    "binds",
                    "both"
                ],
                [
                    "Gene",
                    "Anatomy",
                    "downregulates",
                    "both"
                ],
                [
                    "Anatomy",
                    "Disease",
                    "localizes",
                    "both"
                ]
            ],
            "dgp_id": 21763190,
            "dgp_source_degree": 56,
            "dgp_target_degree": 39,
            "dgp_n_dwpcs": 800,
            "dgp_n_nonzero_dwpcs": 791,
            "dgp_nonzero_mean": 2.1106664913501048,
            "dgp_nonzero_sd": 0.5350022943256412,
            "dgp_reversed": false,
            "cypher_query": "MATCH path = (n0:Compound)-[:BINDS_CbG]-(n1)-[:DOWNREGULATES_AdG]-(n2)-[:LOCALIZES_DlA]-(n3:Disease)\nUSING JOIN ON n1\nWHERE n0.identifier = 'DB00331' // Metformin\nAND n3.identifier = 'DOID:1612' // breast cancer\nWITH\n[\nsize((n0)-[:BINDS_CbG]-()),\nsize(()-[:BINDS_CbG]-(n1)),\nsize((n1)-[:DOWNREGULATES_AdG]-()),\nsize(()-[:DOWNREGULATES_AdG]-(n2)),\nsize((n2)-[:LOCALIZES_DlA]-()),\nsize(()-[:LOCALIZES_DlA]-(n3))\n] AS degrees, path\nWITH path, reduce(pdp = 1.0, d in degrees| pdp * d ^ -0.5) AS PDP\nWITH collect({paths: path, PDPs: PDP}) AS data_maps, count(path) AS PC, sum(PDP) AS DWPC\nUNWIND data_maps AS data_map\nWITH data_map.paths AS path, data_map.PDPs AS PDP, PC, DWPC\nRETURN\n  path AS neo4j_path,\n  substring(reduce(s = '', node IN nodes(path)| s + '–' + node.name), 1) AS path,\n  PDP,\n  100 * (PDP / DWPC) AS percent_of_DWPC\nORDER BY percent_of_DWPC DESC\nLIMIT 10"
        },


dhimmel commented on July 24, 2024

How can I get the ids of each node in each metapath

Terminology correction: metapaths contain metanodes (like Anatomy or Disease) rather than nodes. Actual paths are what contain nodes (e.g. Metformin & Dementia).
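To make the distinction concrete, here is a small Python sketch that reads the metanode sequence off the `metapath_metaedges` field of the API record posted above. It assumes each metaedge is listed as `[source metanode, target metanode, kind, direction]` in traversal order, which is inferred from that single example rather than from API documentation; the function name is hypothetical.

```python
def metanodes_from_metaedges(metaedges):
    """Return the ordered metanode types along a metapath."""
    if not metaedges:
        return []
    # Start from the source of the first metaedge, then follow each target.
    metanodes = [metaedges[0][0]]
    for _source, target, _kind, _direction in metaedges:
        metanodes.append(target)
    return metanodes


# Metaedges copied from the CbGdAlD record in this thread.
metaedges = [
    ["Compound", "Gene", "binds", "both"],
    ["Gene", "Anatomy", "downregulates", "both"],
    ["Anatomy", "Disease", "localizes", "both"],
]

metanodes = metanodes_from_metaedges(metaedges)
print(metanodes)
```

The result is a list of node types, not of specific genes or anatomies; the specific nodes only appear in the paths that instantiate the metapath.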

I'm not sure about the rest of the question, but search-api.het.io has an endpoint to get the paths for a given source node, target node, and metapath combination.

Regarding the JSON output from the API, a tool like pandas could help you convert it to CSV (CSV and JSON are just different encodings of data).
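A minimal sketch of that conversion in Python. pandas would work as suggested; this version uses only the standard library `json` and `csv` modules to avoid a dependency. It assumes the records have already been parsed into a list of dicts, keeps only the columns you name, and leaves out nested fields such as `metapath_metaedges`. The function name is hypothetical and the record is a trimmed copy of the one posted above.

```python
import csv
import io
import json


def records_to_csv(records, columns):
    """Write the selected columns of each record as CSV text."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=columns)
    writer.writeheader()
    for record in records:
        # Missing keys become empty cells instead of raising an error.
        writer.writerow({col: record.get(col, "") for col in columns})
    return buffer.getvalue()


# Trimmed copy of the CbGdAlD record from this thread.
raw_json = """
[
  {
    "metapath_abbreviation": "CbGdAlD",
    "path_count": 126,
    "p_value": 0.0003801126982345647,
    "metapath_metaedges": [["Compound", "Gene", "binds", "both"]]
  }
]
"""

records = json.loads(raw_json)
csv_text = records_to_csv(records, ["metapath_abbreviation", "path_count", "p_value"])
print(csv_text)
```

Writing `csv_text` to a file opened with `newline=""` gives a CSV that spreadsheet tools can open directly.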


ayujain04 commented on July 24, 2024

Sure, so the JSON allowed me to view the metapaths, which is not that helpful for what I am trying to build.

The JSON also provided Cypher queries whose results, if I could download them as CSV, would give me the paths through the actual nodes (which is what I need to construct this hypergraph).

However, because of the number of queries, Neo4j times out. Is there an API that I can call that would return the actual paths?
