The pandas crosstab function in Python builds a pivot table from two variables. The rows of that table are the distinct values of whatever you pass as the index parameter. If you want the rows and columns to be frequency distribution ranges (class intervals) instead, crosstab and pivot_table do not do this on their own. This function was written to meet that need: it is a pivot table function in which both variables are grouped into statistically meaningful frequency distribution ranges.
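To illustrate the limitation described above, here is a small sketch using invented data. The column names are borrowed from the usage example in this README, but the values and the `toy` dataframe are hypothetical. It only shows what a plain `pd.crosstab` call produces on raw numeric columns.

```python
import pandas as pd

# Hypothetical toy data; the values are invented for illustration only.
toy = pd.DataFrame({
    "bioistatistik_notu": [45, 52, 58, 63, 67, 71, 74, 80, 88, 95],
    "lise_basarisi": [50, 55, 61, 60, 70, 72, 78, 85, 90, 92],
})

# Plain crosstab: every distinct value becomes its own row and its own column,
# so the table is as large as the number of distinct values in each variable.
raw_table = pd.crosstab(toy["bioistatistik_notu"], toy["lise_basarisi"])
print(raw_table.shape)
```

With continuous variables such a table is mostly zeros, which is why grouping the values into ranges first is useful.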
The function takes 4 parameters:
given_df = the dataframe
given_series1 = the dataframe column whose ranges form the index (rows) of the pivot table
given_series2 = the dataframe column whose ranges form the columns of the pivot table
m = the number of intervals in the frequency distribution
If m = None, the function determines statistically meaningful frequency ranges by default.
multivariate_freq_dist(given_df, given_series1, given_series2, m)
Please upload the data file in the files section, then load it:
import pandas as pd
df = pd.read_excel('data.xlsx')
Then run the following command:
multivariate_freq_dist(df, 'bioistatistik_notu', 'lise_basarisi', 5)
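For readers who want a feel for what the parameter m controls, the following is a minimal sketch of a binned crosstab. It is not the implementation of multivariate_freq_dist. The function name `binned_crosstab_sketch` is hypothetical, the equal-width binning via `pd.cut` is an assumption, and the default used when m is None (Sturges' rule, k = 1 + 3.322 log10(n)) is also an assumption, since this README does not say which rule the real function applies.

```python
import math
import pandas as pd

def binned_crosstab_sketch(df, row_col, col_col, m=None):
    """Hypothetical stand-in for illustration; not the real multivariate_freq_dist."""
    if m is None:
        # Assumption: Sturges' rule for the number of classes.
        m = math.ceil(1 + 3.322 * math.log10(len(df)))
    # Assumption: equal-width intervals computed separately for each variable.
    rows = pd.cut(df[row_col], bins=m)
    cols = pd.cut(df[col_col], bins=m)
    # Cross-tabulate the interval labels instead of the raw values.
    return pd.crosstab(rows, cols)

# Hypothetical toy data; the values are invented for illustration only.
toy = pd.DataFrame({
    "bioistatistik_notu": [45, 52, 58, 63, 67, 71, 74, 80, 88, 95],
    "lise_basarisi": [50, 55, 61, 60, 70, 72, 78, 85, 90, 92],
})

table_fixed = binned_crosstab_sketch(toy, "bioistatistik_notu", "lise_basarisi", 5)
table_default = binned_crosstab_sketch(toy, "bioistatistik_notu", "lise_basarisi")
print(table_fixed)
```

Each cell counts the observations that fall into the corresponding pair of intervals, so the cell counts add up to the number of rows in the dataframe.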
References:
- Semra ORAL ERBAŞ, Olasılık ve İstatistik
- E. Alptekin ESIN, Sağlık Bilimlerinde İstatistik