oscartgiles / pythurstonian
A public snapshot of the PyThurstonian repository, which will be released later this year
License: MIT License
Hey Oscar!
This package looks really cool. I've been playing around with it since yesterday, and I'm liking the functions. I just wanted to ask if/how one could go about working with the Thurstonian latent utilities directly? Is this even possible?
I've tried your simulation using two within-subject conditions, with latent "beta" utility parameters corresponding to random samples from independent Gaussian distributions: Condition 1 = [0.0, 1.0, 2.0] and Condition 2 = [0.0, 0.1, 0.2], where the variance of each is fixed to 1. The results from your script, in terms of permutations and Kendall's W, are replicated perfectly, but when I dig into the posterior samples of the model, the results seem odd.
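For concreteness, here is a minimal sketch of the simulation setup I mean, written in plain NumPy rather than with PyThurstonian's own helpers (the subject count and seed are arbitrary choices of mine):

import numpy as np

rng = np.random.default_rng(0)
n_subjects = 30
cond_means = {
    'Condition 1': np.array([0.0, 1.0, 2.0]),
    'Condition 2': np.array([0.0, 0.1, 0.2]),
}

for name, mu in cond_means.items():
    # Latent utilities: independent Gaussians with the given means and unit variance
    utilities = rng.normal(loc=mu, scale=1.0, size=(n_subjects, len(mu)))
    # Convert utilities to within-subject rankings (1 = lowest utility)
    ranks = utilities.argsort(axis=1).argsort(axis=1) + 1
    print(name, ranks[:3])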
If I'm following your code and the previous discussions over at the Stan forum correctly, the z_hat parameter should correspond to the posterior draws of the Gaussians, with the first item's utility fixed at zero, and the beta parameter forms the posterior difference of each item from the first (these relative differences appear to be the crux of PyThurstonian's modelling approach).
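To illustrate what I mean by the differencing (a toy example of my own, not PyThurstonian code): pinning the first item's utility at zero and expressing the others relative to it would look like

import numpy as np

# Hypothetical raw latent utilities for three items
z = np.array([0.7, 1.7, 2.7])

# Identify the scale by subtracting the first item's utility,
# so that beta[0] == 0 by construction
beta = z - z[0]
print(beta)  # [0. 1. 2.]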
Working with the differences directly:
import numpy as np

# Extract the posterior 'beta' draws for each condition
# (first index: draw, second index: condition)
C1_Beta = np.array([draw[0] for draw in myThurst.samples['beta']])
C2_Beta = np.array([draw[1] for draw in myThurst.samples['beta']])

# Mean of the differences from the first item, per condition
C1_Beta.mean(axis=0), C2_Beta.mean(axis=0)
# Returns:
(array([0. , 1.00341984, 9.85625661]),
array([0. , 1.29277174, 1.44932662]))
This doesn't look quite right to me: are the differences relative to the first item within each condition really preserved accurately?
Why is the difference between the first and third object in Condition 1 blown up to this extent? The Condition 2 results look closer to our simulated data, but why are the differences centred around 1 rather than 0? Finally, if the difference between the first and second item on the latent scale is ~1 in Condition 1 and ~0.1 in Condition 2, why do the two conditions yield similar estimated differences? Do differences on the latent scale not translate across conditions?
Just for reference, here's how I checked the variability of the estimates in terms of the posterior densities:

import pandas as pd
import matplotlib.pyplot as plt

pd.DataFrame(C1_Beta).loc[:, 1:2].plot(kind='density')
pd.DataFrame(C2_Beta).loc[:, 1:2].plot(kind='density')
plt.show()
I've tried a similar process, drawing directly from z_hat, and the same broad pattern emerges.
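Concretely, that was just the same extraction as above, assuming the samples dictionary lays out the z_hat draws the same way as the beta draws:

# Same extraction as for 'beta', per condition
C1_z = np.array([draw[0] for draw in myThurst.samples['z_hat']])
C2_z = np.array([draw[1] for draw in myThurst.samples['z_hat']])
C1_z.mean(axis=0), C2_z.mean(axis=0)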
Any guidance here?