data-preparation-for-ctd2's People

Contributors

allaway, bence-szalai

data-preparation-for-ctd2's Issues

s is not defined

For this chunk:

# for compounds where inhibitor, blocker, antagonist etc. appear in the MoA column
# we assume they are inhibitory compounds, so we will mark them with -1 in the meta_matrix
# for other compounds, we assume they are activators, so we will mark them with +1
# this is probably not a perfect way to assess inhibitory/activating state, but good for a first try
inhibitory_words=set(['inhibitor','blocker','antagonist','inihibitor']) #inihibitor is just a typo
for i in drug_metadata.index:
    if list(drug_metadata.index).index(i) % 100==0:
        print('Done for %i drugs' %list(drug_metadata.index).index(i))
    brd=drug_metadata.loc[i,'broad_id']
    if not pd.isnull(drug_metadata.loc[i,'moa']):
        moas=drug_metadata.loc[i,'moa'].split('|')
    else:
        moas=[]
    if not pd.isnull(drug_metadata.loc[i,'target']):
        s=1
        targets=drug_metadata.loc[i,'target'].split('|')
        if len(set((' '.join(moas)).split())&inhibitory_words)>0:
            s=-1
    else:
        targets=[]
    meta_matrix.loc[brd,moas]=1
    meta_matrix.loc[brd,targets]=s

I get the error

Done for 0 drugs
---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
<ipython-input-29-ff5ed4cb33f5> in <module>()
     20         targets=[]
     21     meta_matrix.loc[brd,moas]=1
---> 22     meta_matrix.loc[brd,targets]=s

NameError: name 's' is not defined

Apologies for all of the issues. I am an R person with only passing Python familiarity...

Defining s in the final else branch (the one taken when 'target' is null, where s was otherwise never assigned) seems to fix this, but it's not clear to me whether this is an appropriate fix:

for i in drug_metadata.index:
    if list(drug_metadata.index).index(i) % 100==0:
        print('Done for %i drugs' %list(drug_metadata.index).index(i))
    brd=drug_metadata.loc[i,'broad_id']
    if not pd.isnull(drug_metadata.loc[i,'moa']):
        moas=drug_metadata.loc[i,'moa'].split('|')
    else:
        moas=[]
    if not pd.isnull(drug_metadata.loc[i,'target']):
        s=1
        targets=drug_metadata.loc[i,'target'].split('|')
        if len(set((' '.join(moas)).split())&inhibitory_words)>0:
            s=-1
    else:
        s=0
        targets=[]
    meta_matrix.loc[brd,moas]=1
    meta_matrix.loc[brd,targets]=s

Let me know what you think!
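One way to sidestep the question of what s should be when there is no target is to compute the sign only when there are targets to write. The following is a minimal sketch under assumptions, not the notebook author's code: the helper names `split_field` and `target_sign` are hypothetical, and the small `drug_metadata` and `meta_matrix` frames are made-up stand-ins for the real ones. With an empty target list nothing is written, so no placeholder value such as s=0 is needed.

```python
import pandas as pd

INHIBITORY_WORDS = {'inhibitor', 'blocker', 'antagonist', 'inihibitor'}

def split_field(value):
    """Split a '|'-separated metadata field; return [] for missing values."""
    return [] if pd.isnull(value) else value.split('|')

def target_sign(moas):
    """Return -1 if any word in the MoA strings is inhibitory, otherwise +1."""
    words = set(' '.join(moas).split())
    return -1 if words & INHIBITORY_WORDS else 1

# Toy stand-in for drug_metadata (hypothetical values, not the real dataset).
drug_metadata = pd.DataFrame({
    'broad_id': ['BRD-1', 'BRD-2', 'BRD-3'],
    'moa': ['EGFR inhibitor', 'adrenergic receptor agonist', None],
    'target': ['EGFR', 'ADRB1|ADRB2', None],
})

# Toy stand-in for meta_matrix: one column per MoA and per target, all zeros.
columns = ['EGFR inhibitor', 'adrenergic receptor agonist',
           'EGFR', 'ADRB1', 'ADRB2']
meta_matrix = pd.DataFrame(0, index=drug_metadata['broad_id'], columns=columns)

for n, (_, row) in enumerate(drug_metadata.iterrows()):
    if n % 100 == 0:
        print('Done for %i drugs' % n)
    brd = row['broad_id']
    moas = split_field(row['moa'])
    targets = split_field(row['target'])
    if moas:
        meta_matrix.loc[brd, moas] = 1
    if targets:
        # The sign is only computed when there is something to assign it to.
        meta_matrix.loc[brd, targets] = target_sign(moas)

print(meta_matrix)
```

Using `enumerate` for the progress counter also avoids rebuilding `list(drug_metadata.index)` and searching it on every iteration, which is what the original progress check does.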

Chunk 7 error

For chunk 7 in the notebook, I hit the following error:

---------------------------------------------------------------------------
Exception                                 Traceback (most recent call last)
<ipython-input-7-929255b434da> in <module>()
      4 #cid is the sample ides, while rid is the gene ids
      5 expression=parse('../data/GSE92742/GSE92742_Broad_LINCS_Level5_COMPZ.MODZ_n473647x12328.gctx',
----> 6                  cid=samples,rid=gene_ids.index).data_df.T[gene_ids.index]
      7 expression.head()
      8 #so here the rows are the samples where the given genes were knocked down

/home/ec2-user/anaconda3/lib/python3.5/site-packages/cmapPy/pandasGEXpress/parse.py in parse(file_path, convert_neg_666, rid, cid, ridx, cidx, row_meta_only, col_meta_only, make_multiindex)
     60     elif file_path.endswith(".gctx"):
     61         curr = parse_gctx.parse(file_path, convert_neg_666, rid, cid, ridx, cidx, row_meta_only, col_meta_only,
---> 62                                 make_multiindex)
     63     else:
     64         err_msg = "File to parse must be .gct or .gctx!"

/home/ec2-user/anaconda3/lib/python3.5/site-packages/cmapPy/pandasGEXpress/parse_gctx.py in parse(gctx_file_path, convert_neg_666, rid, cid, ridx, cidx, row_meta_only, col_meta_only, make_multiindex)
    101 
    102         # validate optional input ids & get indexes to subset by
--> 103         (sorted_ridx, sorted_cidx) = check_and_order_id_inputs(rid, ridx, cid, cidx, row_meta, col_meta)
    104 
    105         data_dset = gctx_file[data_node]

/home/ec2-user/anaconda3/lib/python3.5/site-packages/cmapPy/pandasGEXpress/parse_gctx.py in check_and_order_id_inputs(rid, ridx, cid, cidx, row_meta_df, col_meta_df)
    137     (col_type, col_ids) = check_id_idx_exclusivity(cid, cidx)
    138 
--> 139     row_ids = check_and_convert_ids(row_type, row_ids, row_meta_df)
    140     ordered_ridx = get_ordered_idx(row_type, row_ids, row_meta_df)
    141 

/home/ec2-user/anaconda3/lib/python3.5/site-packages/cmapPy/pandasGEXpress/parse_gctx.py in check_and_convert_ids(id_type, id_list, meta_df)
    173         if id_type == "id":
    174             id_list = convert_ids_to_meta_type(id_list, meta_df)
--> 175             check_id_validity(id_list, meta_df)
    176         else:
    177             check_idx_validity(id_list, meta_df)

/home/ec2-user/anaconda3/lib/python3.5/site-packages/cmapPy/pandasGEXpress/parse_gctx.py in check_id_validity(id_list, meta_df)
    189             mismatch_ids)
    190         logger.error(msg)
--> 191         raise Exception("parse_gctx check_id_validity " + msg)
    192 
    193 

Exception: parse_gctx check_id_validity some of the ids being used to subset the data are not present in the metadata for the file being parsed - mismatch_ids:  {'10610', '7874', '4776', '1019', '256364', '8678', '9695', '4851', '5547', '9897', '9833', '2017', '7077', '64080', '5211', '8440', '5699', '3775', '7867', '5289', '6804', '9275', '9134', '79143', '9961', '11044', '10606', '958', '2264', '23131', '56997', '11031', '9688', '22887', '25874', '965', '2065', '4172', '2817', '79071', '1870', '80212', '57048', '54623', '80347', '22934', '8553', '1052', '7020', '1027', '9375', '960', '9221', '16', '3157', '142', '54499', '6195', '50865', '29763', '10123', '9133', '55111', '11232', '1277', '79961', '29978', '4860', '51203', '10057', '50814', '9928', '65123', '8312', '1514', '4836', '6304', '26064', '7398', '9170', '4067', '9053', '388650', '2954', '25966', '6117', '4482', '5927', '9552', '1017', '4331', '23325', '4313', '9097', '3909', '5110', '64422', '29890', '3162', '5716', '1956', '65057', '7319', '4927', '90861', '9801', '843', '5831', '10810', '51382', '51097', '4609', '670', '27032', '26036', '84722', '8985', '3122', '22883', '26511', '10221', '54681', '10206', '30836', '94239', '27244', '147179', '994', '9650', '7750', '10298', '9455', '54386', '2956', '23588', '30001', '51116', '9903', '79170', '23536', '596', '780', '808', '51599', '7168', '10845', '27242', '23212', '5058', '2146', '60528', '332', '9702', '7159', '29916', '22827', '10494', '868', '11098', '7538', '55608', '2736', '9448', '3566', '79094', '3098', '11230', '8884', '4232', '25839', '25932', '23300', '10190', '55127', '54807', '5440', '2961', '55746', '1029', '8349', '3251', '4846', '5331', '23326', '79080', '4312', '5359', '7416', '9761', '6659', '355', '7849', '2058', '9813', '10904', '2852', '5641', '11014', '2597', '1861', '8727', '11073', '10682', '22796', '3300', '3303', '5154', '8573', '11168', '3597', '55011', '5373', '5613', '5607', '58497', '6347', '51021', '9641', '27336', 
'51742', '55256', '51466', '1399', '3895', '2896', '9276', '664', '5921', '8091', '6709', '23244', '57761', '51070', '1635', '10276', '26136', '6182', '5050', '51071', '5236', '9686', '5743', '3978', '3486', '695', '2886', '5710', '4638', '5829', '3202', '7994', '8270', '26292', '4794', '50810', '200081', '51465', '10150', '9019', '4792', '5347', '29083', '1676', '5900', '1647', '54733', '5899', '9917', '10921', '387', '8731', '6714', '55604', '84890', '9854', '51495', '8835', '5982', '10559', '23061', '10013', '25825', '1212', '55620', '998', '10320', '891', '1605', '2673', '10892', '10051', '54505', '8550', '23659', '7264', '226', '323', '1994', '3315', '5290', '622', '5261', '10775', '3964', '28969', '3329', '56889', '10489', '4208', '10732', '3028', '4016', '1465', '23636', '23378', '57019', '6616', '2542', '10180', '79006', '9217', '29103', '5480', '8508', '5788', '24149', '572', '9124', '5720', '55033', '5108', '2548', '5111', '23029', '6342', '1050', '10112', '23499', '10973', '79716', '7043', '54442', '9112', '63933', '2309', '1738', '8624', '58533', '11319', '6990', '55847', '211', '51015', '4125', '6622', '6193', '26001', '2288', '701', '57215', '11011', '23670', '5257', '8520', '200734', '840', '9143', '9712', '55324', '10513', '9126', '672', '11041', '1111', '2274', '1026', '6790', '1398', '993', '5601', '7165', '178', '51056', '5054', '10972', '10953', '4088', '116832', '5925', '51375', '5883', '10695', '896', '2582', '9703', '25793', '79921', '58472', '3930', '5498', '5873', '54881', '823', '66008', '6839', '8870', '2184', '55148', '5096', '10318', '847', '1616', '7158', '9653', '644', '23161', '63874', '10765', '23200', '6251', '9261', '8851', '9533', '2887', '5467', '5255', '10165', '10818', '89910', '22794', '310', '10898', '51053', '23038', '26227', '8202', '1062', '51635', '9697', '23014', '5048', '6832', '10523', '51031', '902', '10049', '9016', '10557', '6676', '8204', '351', '60493', '8518', '1500', '84159', '10525', '1759', '1958', '10668', 
'9670', '10857', '4605', '5525', '3988', '1001', '533', '8396', '10273', '3628', '29928', '392', '8837', '11284', '83743', '9267', '5300', '890', '55129', '8878', '10617', '11151', '10915', '11072', '5468', '6009', '8444', '6390', '2523', '4282', '23597', '6119', '6275', '1153', '1385', '4775', '2353', '51005', '1848', '6772', '10962', '55837', '8503', '9709', '8895', '9212', '9842', '51422', '2185', '79174', '8050', '57192', '8318', '3066', '9817', '9517', '10270', '10007', '1123', '7296', '595', '5529', '6461', '6777', '8804', '5721', '22889', '9710', '831', '56654', '4154', '10227', '9488', '51293', '3508', '9851', '27109', '8324', '8869', '23365', '54438', '10398', '10146', '2063', '55012', '91137', '8321', '6919', '2690', '5566', '2553', '2064', '3108', '665', '79073', '1846', '79090', '11065', '1459', '1452', '4582', '1454', '4931', '3416', '6944', '64943', '7157', '9181', '1070', '230', '6774', '10954', '10174', '6696', '51026', '26020', '23530', '9915', '6812', '2048', '3383', '2745', '9805', '8480', '4783', '79947', '2624', '6443', '23224', '1950', '3551', '3925', '6856', '57178', '4998', '2946', '3611', '11188', '8446', '5019', '6509', '81544', '6464', '2356', '51160', '64429', '64746', '6813', '4864', '23039', '10046', '11325', '207', '30', '10641', '6915', '10670', '2778', '23076', '2769', '6194', '4791', '51569', '10730', '10493', '8826', '7082', '9868', '6284', '7105', '56940', '2263', '2770', '23142', '8720', '3033', '23', '7074', '7690', '11004', '47', '5696', '4043', '8821', '6499', '5427', '6909', '55818', '6253', '5583', '8800', '9519', '4891', '10058', '9738', '79850', '3308', '56924', '5909', '637', '23210', '2920', '4893', '26128', '23585', '25987', '6850', '23512', '7905', '9847', '1802', '5880', '80758', '23097', '10589', '976', '6894', '23443', '54957', '899', '7494', '9128', '7376', '55793', '2042', '11200', '23410', '7088', '11142', '9246', '23635', '5891', '4690', '54205', '873', '9467', '10059', '1978', '23011', '7852', '93487', '5092', 
'55825', '6597', '1662', '55958', '51024', '10237', '80746', '10245', '5993', '4780', '9797', '1891', '1677', '10776', '3553', '427', '26054', '148022', '4144', '22809', '29082', '874', '3091', '9988', '4651', '7866', '25803', '11344', '1022', '2767', '7016', '23335', '5287', '2222', '1786', '23321', '1666', '10362', '5867', '5997', '8974', '7358', '23271', '6599', '128', '5985', '2908', '5827', '54850', '501', '64781', '6657', '10038', '5223', '991', '51335', '329', '22905', '9924', '22926', '64428', '7048', '30849', '5889', '5971', '55556', '7485', '51719', '55008', '2115', '29911', '5438', '6908', '3682', '58478', '51282', '54512', '1643', '8243', '51001', '5366', '1633', '10681', '8569', '5770', '835', '1534', '10099', '6050', '8574', '57406', '4925', '3280', '5106', '6500', '3206', '10644', '11007', '813', '2958', '22841', '9183', '7982', '79643', '4616', '5782', '23658', '7027', '11261', '3815', '23386', '6697', '6118', '8061', '8726', '2109', '652', '5654', '53343', '55038', '5321', '4793', '11157', '3480', '375346', '1445', '5796', '6793', '2771', '10797', '983', '10782', '22908', '9943', '55179', '39', '10434', '9637', '80349', '10652', '23149', '55699', '9289', '4216', '466', '84617', '6988', '5603', '1829', '2037', '10813', '836', '29937', '54915', '25', '3693', '10450', '22823', '1021', '5715', '5627', '3482', '3725', '8996', '3156', '727', '57147', '23463', '2195', '5708', '6603', '23223', '4817', '1509', '3454', '4303', '2810', '80204', '5747', '9270', '9918', '291', '10153', '23368', '3385', '1981', '27346', '9093', '55748', '5792', '1831', '23139', '5986', '5580', '4200', '581', '55893', '1983', '54541', '949', '2131', '57804', '124583', '9926', '57149', '85377', '8607', '2625', '642', '11182', '1213', '2113', '5836', '25976', '9690', '7015', '79600', '26520', '102', '7106', '23013', '10969', '3337', '10131', '9531', '9491', '6184', '27095', '5777', '3312', '85236', '25805', '23338', '11137', '3800', '481', '8914', '2534', '10451', '1282', '1906', 
'7099', '81533', '348', '7511', '79902', '3638', '10329', '11000', '5355', '5423', '7466', '5588', '1845', '1429', '5018', '91949', '1788', '23522', '7153', '5898', '3398', '8900', '4850', '51170', '79187', '154', '5357', '23077', '23647', '10491', '6810', '93594', '10285', '26993', '51385', '50813'}
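The exception says that some of the requested row ids are absent from the file's row metadata. A quick way to diagnose this is to compare the two id sets before subsetting. The sketch below is only an illustration: `split_ids` is a hypothetical helper, and the ids are made-up placeholders. In the real notebook the available ids would have to come from the .gctx file itself; the `parse` signature in the traceback includes a `row_meta_only` parameter that could presumably be used for that, but that step is an assumption and is not run here.

```python
import pandas as pd

def split_ids(requested, available):
    """Return (present, missing), preserving the order of `requested`."""
    available = set(available)
    present = [r for r in requested if r in available]
    missing = [r for r in requested if r not in available]
    return present, missing

# Placeholder ids standing in for the file's row ids and for gene_ids.index.
file_row_ids = pd.Index(['g1', 'g2', 'g3', 'g4'])
gene_ids = pd.DataFrame({'symbol': ['A', 'B', 'C']}, index=['g1', 'g3', 'g9'])

present, missing = split_ids(gene_ids.index, file_row_ids)
print('%i ids found in the file, %i missing' % (len(present), len(missing)))
print('missing:', missing)
```

If only the `present` ids were then passed as `rid`, the validity check would no longer see unknown ids, but whether silently dropping the missing genes is acceptable depends on the analysis. A long list of missing ids may instead point to a mismatch between the gene list and the file being parsed.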
