Hi Dr. Wang, I'm a surgeon in China. I'm really interested in your SurvTrace and i

How to prepare model input from my own data? (about RyanWangZf/SurvTRACE)

RyanWangZf commented on August 30, 2024

Hi there,
You can refer to
https://github.com/RyanWangZf/SurvTRACE/blob/main/data/seer_processed.csv
for the standard input format of the data.

Once your data is in that format, refer to
https://github.com/RyanWangZf/SurvTRACE/blob/main/survtrace/dataset.py

especially the branch under

elif data == "seer":

and set PATH_DATA, event_list, cols_categorical, cols_standardize, and config['num_event'] to fit your data.

Also, since SurvTRACE is built on transformers, you will need a GPU (something like an RTX 3060) to train it efficiently.

Feel free to reach out if you have further questions. 😀
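
As a rough illustration of the preparation step described above, here is a minimal sketch that brings a raw table into a `duration` / `event` layout and names the feature columns. The raw column names (`os_days`, `death`, `stage`, `age`) and the values are hypothetical, and the exact layout of seer_processed.csv is not reproduced here; only `duration`, `event`, `cols_categorical` and `cols_standardize` are names taken from this thread.

```python
import pandas as pd

# Hypothetical raw table: every column name and value here is made up.
raw = pd.DataFrame({
    "os_days": [90, 92, 100, 103],        # follow-up time
    "death": [1, 0, 0, 1],                # 1 = event observed, 0 = censored
    "stage": ["I", "II", "II", "III"],    # a categorical feature
    "age": [61.0, 55.0, 70.0, 48.0],      # a numeric feature
})

# Rename the time and event columns to the names used in this thread.
df = raw.rename(columns={"os_days": "duration", "death": "event"})

# Lists to adapt in the dataset branch mentioned above.
cols_categorical = ["stage"]
cols_standardize = ["age"]
feature_cols = cols_categorical + cols_standardize

# Basic sanity checks before handing the table to any loader.
assert df["duration"].gt(0).all(), "durations should be positive"
assert df["event"].isin([0, 1]).all(), "event should be coded 0/1"
print(df[["duration", "event"] + feature_cols])
```

The two checks at the end are only a suggestion: they catch zero durations and unexpected event codes before training starts.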

from survtrace.

Jwenyi commented on August 30, 2024

Thanks for your response!! I'll try it :)

Jwenyi commented on August 30, 2024

Hi Zifeng,
I've built my own SurvTRACE model following your suggestion, thanks!! However, I have a new question: how do I predict the prognosis of an individual patient/sample?
I used to train a Cox regression model (a simple statistical model) or XGBoost, which provides a predicted score (or a survival function value?) for each patient, so we could use these scores to stratify patients. Is there a way to get a prediction for each sample and output a dataframe or matrix containing these predictions? In other words, how do we use SurvTRACE to assign a predicted score to each patient?
P.S. I trained SurvTRACE without competing risks; each patient has only one event, "death or alive".

RyanWangZf commented on August 30, 2024

Hi, you can use these four functions to get the predicted hazard/risk/survival rate for patients. They start at

SurvTRACE/survtrace/model.py

Line 277 in 0d40f37

def predict_hazard(self, input_ids, batch_size=None):

and continue below it. They output the hazard/risk/survival rate at each discrete time point, corresponding to the time horizons set in

SurvTRACE/survtrace/config.py

Line 7 in 0d40f37

'horizons': [.25, .5, .75], # the discrete intervals are cut at 0%, 25%, 50%, 75%, 100%

They can be used like this:

surv = model.predict_surv(df_test, batch_size=val_batch_size)
risk = 1 - surv

For more details, please refer to the evaluation class

SurvTRACE/survtrace/evaluate_utils.py

Line 6 in 0d40f37

class Evaluator:
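
To connect this to the stratification question above, here is a minimal sketch of turning a survival matrix into one risk score per patient. The array below is a made-up placeholder standing in for the output of `model.predict_surv`; its shape (patients by time points), the choice of column, and the median split are all assumptions for illustration, not something this thread confirms. If the real output is a tensor, it would need converting to an array first.

```python
import numpy as np

# Placeholder for the predicted survival probabilities.
# Assumed shape: (n_patients, n_time_points); values are invented.
surv = np.array([
    [0.95, 0.80, 0.60],
    [0.70, 0.40, 0.20],
    [0.90, 0.85, 0.75],
    [0.60, 0.30, 0.10],
])

# Pick one time point (here the second column) and use risk = 1 - survival
# as the per-patient score, as in the snippet above.
horizon_idx = 1
risk = 1.0 - surv[:, horizon_idx]

# Split patients into two groups at the median score (one possible rule).
threshold = np.median(risk)
group = np.where(risk > threshold, "high", "low")

for patient_id, (score, label) in enumerate(zip(risk, group)):
    print(patient_id, round(float(score), 3), label)
```

The `risk` and `group` arrays can be placed in a dataframe next to the patient identifiers if a table is needed. Which time point to use and where to cut are analysis choices, so the median split here is only one option.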

Jwenyi commented on August 30, 2024

Thanks a lot!! It's really helpful for me 😀

RyanWangZf commented on August 30, 2024

It's my pleasure~ welcome to star our projects if it's helpful 😇

Jwenyi commented on August 30, 2024

Surely!!

Jwenyi commented on August 30, 2024

Hi Zifeng,
Sorry to bother you again, but I ran into a new problem while training SurvTRACE. 😂
When I run the function "load_data" from "dataset.py", it reports:
UserWarning: Got event/censoring at start time. Should be removed! It is set s.t. it has no contribution to loss.
warnings.warn("""Got event/censoring at start time. Should be removed! It is set s.t. it has no contribution to loss.""")
It comes from this line:
y = labtrans.transform(*get_target(df)) # y = (discrete duration, event indicator)
However, I checked my input data and found no censoring or event at the start time, and I'm sure that no patient has "duration 0, event 1". Maybe the problem comes from:

times = np.quantile(df["duration"][df["event"] == 1.0], horizons).tolist()
times
[389.500000125, 601.9999998, 1120.75]
This code sets the time intervals, but I do have some patients whose "duration" is less than 389.5 and whose "event" is 1 (death). Does that cause the warning? If so, I noticed that even if I delete these patients, "times" will change as well, and there will be new patients who do not meet the condition.

How should I solve this problem? Or does it not affect the model's performance, so that it can be ignored? I am eagerly looking forward to your reply.

P.S. Part of my data is listed below, showing the patients with the shortest durations:

duration	event	AURKA.FGD6	AURKA.GABRP	CLDN9.IL27RA	DPYD.FANCI
90	1	0	1	1	0
92	0	0	0	0	0
100	0	0	0	1	1
100	0	0	1	1	1
103	1	1	1	0	0
108	0	1	1	0	1
112	0	1	1	1	1
120	0	0	1	1	0
126	1	1	1	0	0
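
For reference, the quantile line quoted above can be run on just the nine rows in this table. This is only a sketch on the excerpt, so the resulting cut points differ from the [389.5, 602.0, 1120.75] reported for the full dataset; `horizons` uses the [.25, .5, .75] setting mentioned earlier in the thread.

```python
import numpy as np

# The nine rows shown in the table above (duration, event).
duration = np.array([90, 92, 100, 100, 103, 108, 112, 120, 126], dtype=float)
event = np.array([1, 0, 0, 0, 1, 0, 0, 0, 1])

horizons = [0.25, 0.5, 0.75]

# Quantiles are taken over patients with an observed event only,
# so censored rows do not move the cut points.
times = np.quantile(duration[event == 1], horizons).tolist()
print(times)  # [96.5, 103.0, 114.5]
```

Because the quantiles depend on which event rows are present, deleting patients shifts `times`, which matches the behaviour described in the question.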

Jwenyi commented on August 30, 2024

P.S. I figured that maybe the patient with the shortest duration should not be a "death"? So I also deleted this patient, but unfortunately I got the warning again. 😂

RyanWangZf commented on August 30, 2024

I checked the code where this warning is raised:

SurvTRACE/survtrace/utils.py

Lines 77 to 84 in 0d40f37

    
    if idx_durations.min() == 0:
        warnings.warn("""Got event/censoring at start time. Should be removed! It is set s.t. it has no contribution to loss.""")
        t_frac[idx_durations == 0] = 0
        events[idx_durations == 0] = 0
    idx_durations = idx_durations - 1
    # get rid of -1
    idx_durations[idx_durations < 0] = 0
    return idx_durations.astype('int64'), events.astype('float32'), t_frac.astype('float32')

Before the operation on line 81, a patient with duration < 389.500000125 is actually assigned index 1 rather than 0. So the warning is raised because durations[idx_durations == 0] == 0.

Could you add a breakpoint there and print(durations[idx_durations == 0]) to show me the output?

Jwenyi commented on August 30, 2024

Thanks Zifeng! I checked there and found the output is "92.00000018".

RyanWangZf commented on August 30, 2024

Do you mean there is only one output and it's not zero? That's weird 😂
I copied this transform code from pycox
https://github.com/havakv/pycox/blob/d384d4f0ac89ddd8458daabfd3fe271ff26542e3/pycox/preprocessing/label_transforms.py#L150

so I don't know what happened.

But if there is only one output, only that single sample will be dropped from the loss, and I guess it will not influence the results much 😇
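
One possible explanation, offered only as a hypothesis: if the first cut point equals the minimum observed duration rather than 0, then the sample sitting exactly on that cut would receive index 0 and trigger the warning, which would be consistent with a single non-zero output such as 92.00000018. The sketch below illustrates this with `np.searchsorted` as a simplified stand-in for the discretization; it is not the pycox or SurvTRACE code, and the cut values (in particular the final 2000.0) and the durations are invented.

```python
import numpy as np

# Assumed cut points: minimum duration first, then the quantile times
# from this thread (rounded), then a made-up maximum.
cuts = np.array([92.0, 389.5, 602.0, 1120.75, 2000.0])

# A few invented durations, including one exactly at the first cut.
durations = np.array([92.0, 100.0, 103.0, 400.0, 700.0, 2000.0])

# Map each duration to the index of the first cut that is >= the duration.
# Only a duration equal to cuts[0] lands on index 0.
idx_durations = np.searchsorted(cuts, durations, side="left")
print(idx_durations.tolist())             # [0, 1, 1, 2, 3, 4]

# Same check as the suggested breakpoint print.
print(durations[idx_durations == 0])      # [92.]
```

Under this assumption, removing the shortest-duration patient just promotes the next-shortest one to index 0, so the warning would reappear after each deletion while affecting only one sample at a time.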

Jwenyi commented on August 30, 2024

Thanks! I checked my input data and the processed data and found that the sample size changed very little. 😀
