Comments (36)
Can you be more specific? In general, a multidimensional fit can be reformulated as a normal, scalar fit, and solved accordingly.
from gpufit.
I meant something link:
f(x_1,x_2,x_3 ; alpha_1, alpha_2) = {alpha_1x_1 + alpha_2x_2, alpha_1x_2 + alpha_2x_3, alpha_1alpha_2x_1 + e^alpha_2 x_3};
So my samples are point in 3D space and I have to fit those with that vector function. My actual case is much more complicated and doesn't have a simple closed form expression.
You can also imagine the function has to be fit using a LSE minimizer.
from gpufit.
Is it possible to fit such a function? Or is there a workaround for that?
from gpufit.
Yes, it can but you have to use a customized model function and fit estimator. Writing your own model function and fit estimator you can easily fit functions with vectors as output in Gpufit. See: http://gpufit.readthedocs.io/en/latest/customization.html
I guess using the built-in LSE estimator would have problems though because it assumes that the number of model values is equal to the number of independent variable values.
from gpufit.
According to the documentation I should...
- Define an additional model ID in file constants.h. (Easy...
- Implement a CUDA device function within a newly created .cuh file in folder Gpufit/Gpufit/models according to the following template.
__device__ void ... ( // ... = function name
float const * parameters,
int const n_fits,
int const n_points,
float * value,
float * derivative,
int const point_index,
int const fit_index,
int const chunk_index,
char * user_info,
std::size_t const user_info_size)
{
///////////////////////////// value //////////////////////////////
value[point_index] = ... ; // formula calculating fit model values
/////////////////////////// derivative ///////////////////////////
float * current_derivative = derivative + point_index;
current_derivative[0 * n_points] = ... ; // formula calculating derivative values with respect to parameters[0]
current_derivative[1 * n_points] = ... ; // formula calculating derivative values with respect to parameters[1]
.
.
.
}
How do I specify in this case, "the derivative is a matrix" and "the function should output three values" instead of one? I assume for the derivative I'll just address the array accordingly, it should be a problem.
- Include the newly created .cuh file in models.cuh (Or I just add my new model in an existing .cuh I guess)
- Add a switch case in the CUDA device function calculate_model() in file models.cuh to allow calling the newly added model function. (Easy to do I guess, is there anything I should keep in mind that I might possibly not considering?).
For the estimator I guess I'll be asking something later, but should I essentially implement a specific LSE estimater for my vector function?
Thx
from gpufit.
Hi, could anyone give me more details based on what I asked above? I can't figure out how to proper index the derivatives and function values in my case. The documentation doesn't help much...
from gpufit.
Hi,
I think what you're asking for is already possible using the existing Gpufit framework. Your function returns a vector of values, let's say 3 values, and you are fitting data that has three data values per coordinate.
In this situation you can treat the problem as a simultaneous fit to multiple datasets, with shared parameters. For example, let's say your data values are a 3 x N array (a vector of 3 values for N coordinates) with a value [Y1,Y2,Y3] at each coordinate N. You could concatenate these values into a long 1D array [Y1(x1), Y1(x2), Y1(x3)...... Y2(x1), Y2(x2), Y2(x3).... Y3(x1), Y3(x2), Y3(x3), .... ] for sending to Gpufit. Next, write your model function to return the appropriate value, depending on which element it is working on - i.e. if your model function is working on a point in the Y1 section of the array, then it should return the first element of the vector, if it is in the Y2 section, then the second element, and so on. The same is true for the derivative values. The function is always using the same parameter set, however.
The existing LSE or MLE estimators should already work for this case, without modification.
Does this solution address your question?
from gpufit.
I fully agree with superchromix. The existing LSE and MLE estimators will work with vector-valued functions as long as the number of data points per fit is the number of x coordinates times number of values per fit.
Just for completeness: an alternative ordering would be [Y1(x1) Y2(x1) Y3(x1) Y1(x2) Y2(x2) Y3(x3) ...].
The task reduces to implement a custom model function. Me might not yet have an example for that case. Might be good to have one.
from gpufit.
Hi superchromix, jkfindeisen.
Because there's no example that would expound the details of what you're proposing I'd need you to guide me... So please bear with me.
In this situation you can treat the problem as a simultaneous fit to multiple datasets, with shared parameters. For example, let's say your data values are a 3 x N array (a vector of 3 values for N coordinates) with a value [Y1,Y2,Y3] at each coordinate N.
So far I'm with you...
For example, let's say your data values are a 3 x N array (a vector of 3 values for N coordinates) with a value [Y1,Y2,Y3] at each coordinate N. You could concatenate these values into a long 1D array [Y1(x1), Y1(x2), Y1(x3)...... Y2(x1), Y2(x2), Y2(x3).... Y3(x1), Y3(x2), Y3(x3), .... ] for sending to Gpufit.
For the examples provided with GPUFit the generation of the samples is implemented in a separate function. In the case of the void gauss_fit_2d_example() {...}
this is where the samples generation is performed. I assume I'll need to create a function like this as unit test for my model, which convey the generation of samples according to the layout you mentioned, is this correct?
Next, write your model function to return the appropriate value, depending on which element it is working on - i.e. if your model function is working on a point in the Y1 section of the array, then it should return the first element of the vector, if it is in the Y2 section, then the second element, and so on.
According to the templates in the customization page, what parameter/parameters of the model function would tell me on what section of the array I'd be working on?
The same is true for the derivative values. The function is always using the same parameter set, however.
About the derivatives/hessians, I'm understanding that in this case the derivative would be a a matrix, while the hessian would be a 3D array in theory (though I'd re-index a 1D array to reproduce the same structure) is this correct?
from gpufit.
I assume I'll need to create a function like this as unit test for my model, which convey the generation of samples according to the layout you mentioned, is this correct?
It's up to you if you want to create such a unit test
According to the templates in the customization page, what parameter/parameters of the model function would tell me on what section of the array I'd be working on?
The "point index" tells you which element you are working on. Your model function would contain something like a switch statement, returning a different value depending on the point index.
About the derivatives/hessians, I'm understanding that in this case the derivative would be a a matrix, while the hessian would be a 3D array in theory (though I'd re-index a 1D array to reproduce the same structure) is this correct?
The derivatives are also returned as a 1D array, of length n_parameters * n_points. In the example I gave above for each parameter and data point you have three derivatives, and you need to return them with the same ordering as how you choose to organize the input data.
from gpufit.
Continuing with the example, the final size of your derivatives array would be n_parameters * n_points * length_of_function_vector (three in this case).
from gpufit.
To see how to organize the derivatives, you can look into e.g. the Gaussian model function code.
e.g. http://github.com/gpufit/Gpufit/blob/master/Gpufit/models/gauss_1d.cuh
from gpufit.
No. The derivatives are also returned as a 1D array, of length n_parameters * n_points. In the example I gave above for each parameter and data point you have three derivatives, and you need to return them with the same ordering as how you choose to organize the input data.
But in my case that 1D array will represent the Jacobian of my model right? How about the hessian instead?
In the example of the 1D gaussian function (which dependes on 4 parameters) your 1D array of derivatives represents the gradient of the model with respect to the parameters to be fitted. This means if I had two components that 1D array should represent a matrix. Is that right?
(It will still be a 1D array, but it's the interpretation that changes).
Unless, again, I'm missing something.
from gpufit.
You don't need to calculate the Hessian - you only want the derivative of the function with respect to each of the parameters. You are essentially treating your function, which returns a vector (of e.g. length three) as three separate functions.
from gpufit.
I won't need the hessian because the estimator won't change, right? I'll re-use the ones already available.
from gpufit.
Yes, you can use the standard estimator.
from gpufit.
Everything boils down to just create a new model with the features you mentioned then, if this is true I'll follow the guidelines you just gave me. I'll eventually post the example I have in mind, if it is of any interest.
from gpufit.
Great. Yes that would be helpful - we could add it to the docs. Post back here if you need any more guidance. You might want to create a simple toy model function to get a feeling for this first, before implementing your specific function.
from gpufit.
I'll try. Thank you.
from gpufit.
Hi, I've this toy example.
/* Description of the test function
* =================================
* Computes function value and derivatives of the function
*
* f(x[0],x[1],x[2]) = {p[0]*x[0] + p[1]*x[1],p[0]*x[1] + p[2]*x[2]}
* The derivative, namely the Jacobian with respect the the parameters is given by
* Jacobian(f) = {
* x[0], x[1], 0.0f,
* x[1], 0.0f, x[2]
* }
*
*/
__device__ void calculate_vector_function(
float const * parameters,
int const n_fits,
int const n_points,
float * value,
float * derivative,
int const point_index,
int const fit_index,
int const chunk_index,
char * user_info,
std::size_t const user_info_size) {
// indices
float * user_info_float = (float*)user_info;
float x = 0.0f;
if (!user_info_float)
{
x = point_index;
}
else if (user_info_size / sizeof(float) == n_points)
{
x = user_info_float[point_index];
}
else if (user_info_size / sizeof(float) > n_points)
{
int const chunk_begin = chunk_index * n_fits * n_points;
int const fit_begin = fit_index * n_points;
x = user_info_float[chunk_begin + fit_begin + point_index];
}
// parameters
float const * p = parameters;
// values and jacobian
if(!(point_index & 1)) {
value[point_index] = p[0]*x[0] + p[1]*x[1]; //first component
current_derivative[0 * n_points] = x[0]; //gradient first component
current_derivative[1 * n_points] = x[1];
current_derivative[2 * n_points] = 0.0f;
} else {
value[point_index] = p[0]*x[1] + p[2]*x[2]; //second component
current_derivative[3 * n_points] = x[1]; //gradient second component
current_derivative[4 * n_points] = 0.0f;
current_derivative[5 * n_points] = x[2];
}
}
Now, did you mean something like this? Also if you could clarify how do I take into account that some parameters are shared that would be helpful.
from gpufit.
I think it should be something more like this.
This really depends on how you are passing the data into Gpufit. Below, I assume that the data is interleaved - i.e. [Y1(0), Y2(0), Y1(1), Y2(1), ...] . I also removed the parts involving user_info, since it's not strictly necessary to use custom x values.
__device__ void calculate_vector_function(
float const * parameters,
int const n_fits,
int const n_points,
float * value,
float * derivative,
int const point_index,
int const fit_index,
int const chunk_index,
char * user_info,
std::size_t const user_info_size)
{
// indices
float * user_info_float = (float*)user_info;
float x = 0.0f;
x = point_index / 2; // INTEGER DIVISION
// parameters
float const * p = parameters;
// derivative
float * current_derivative = derivative + point_index;
// values and jacobian
if(point_index % 2 == 0) {
// even point indices - first element of the vector
value[point_index] = p[0]*x + p[1]*x; //first component
current_derivative[0 * n_points] = x; //gradient first component
current_derivative[1 * n_points] = x;
current_derivative[2 * n_points] = 0.0f;
} else {
// odd point indices - second element of the vector
value[point_index] = p[0]*x[1] + p[2]*x[2]; //second component
current_derivative[0 * n_points] = x; //gradient second component
current_derivative[1 * n_points] = 0.0f;
current_derivative[2 * n_points] = x;
}
}
`
from gpufit.
How about the shared parameters part? (You forgot to remove the array indices in the second component by the way). Still confused how is that supposed to be done. Moreover it seems to me you used the same coordinate x to compute the function value, but the function depends from three scalar values and returns two components. Basically as I said in the comment I'm trying to implement:
f(x[0],x[1],x[2]) = {p[0]*x[0] + p[1]*x[1],p[0]*x[1] + p[2]*x[2]}
from gpufit.
So, if your function is
f(x,y,z) = (P0 * x + P1 * y, P0 * y + P2 * z)
Then you just have to choose how you want to organize the X, Y, and Z coordinates, and the two values of the function, in the input array. The same thing is done in the 2D Gaussian example.
one possible organization:
[Y0(x0,y0,z0),Y1(x0,y0,z0),Y0(x1,y0,z0),Y1(x1,y0,z0),........]
it's up to you to decide this.
from gpufit.
If you have unequal numbers of X, Y, and Z values, then you would also have to specify the number of X, Y, and Z points to the model function, via user_info for example, or the function has no way of knowing what the x,y,z coordinates are.
from gpufit.
So, if your function is
f(x,y,z) = (P0 * x + P1 * y, P0 * x + P2 * y)
No, the function is
f(x,y,z) = (P0x + P1y,P0y + P2z);
For the coordinates organization I'm assuming this
x[0],y[0],z[0],x[1],y[1],z[1],...,x[i],y[i],z[i],....
If N are my samples, as you said I need an array 'v' of size 3*N, therefore I can retrive the coordinates x,y,z of the i-th sample as
(x,y,z) = (v[3i],v[3i+1],v[3*i+2]).
from gpufit.
If your function lives in a three dimensional space, then the total data size must be N_x * N_y * N_z, where N_x is the number of X coordinates, etc.
Unless, that is, your data is unevenly sampled, in which case you need to use the user_info parameter to pass in the coordinates.
from gpufit.
If your function lives in a three dimensional space, then the total data size must be N_x * N_y * N_z, where N_x is the number of X coordinates, etc.
I'm assuming N_x = N_y = N_y, so I'll have a total of N_x^3 samples.
Also I've modified my function model as follows
__device__ void calculate_vector_function(
float const * parameters,
int const n_fits,
int const n_points,
float * value,
float * derivative,
int const point_index,
int const fit_index,
int const chunk_index,
char * user_info,
std::size_t const user_info_size) {
// indices
int const n_points_x = cbrt((float)n_points);
int const point_index_z = point_index / (n_points_x*n_points_x);
int const point_index_y = (point_index - point_index_z*n_points_x*n_points_x)/n_points_x;
int const point_index_x = point_index - point_index_z*n_points_x*n_points_x - point_index_y * n_points_x;
float const argx = point_index_x;
float const argy = point_index_y;
float const argz = point_index_z;
// parameters
float const * p = parameters;
float * current_derivative = derivative + point_index;
// values and jacobian
if(point_index % 2 == 0) {
value[point_index] = p[0]*argx + p[1]*argy; //first component
current_derivative[0 * n_points] = argx; //gradient first component
current_derivative[1 * n_points] = argy;
current_derivative[2 * n_points] = 0.0f;
} else {
value[point_index] = p[0]*argy + p[2]*argz; //second component
current_derivative[0 * n_points] = argy; //gradient second component
current_derivative[1 * n_points] = 0.0f;
current_derivative[2 * n_points] = argz;
}
}
The reason for this bit:
// indices
int const n_points_x = cbrt((float)n_points);
int const point_index_z = point_index / (n_points_x*n_points_x);
int const point_index_y = (point_index - point_index_z*n_points_x*n_points_x)/n_points_x;
int const point_index_x = point_index - point_index_z*n_points_x*n_points_x - point_index_y * n_points_x;
It's because given that I have N^3 samples then the the the indexing in the linear array follows the formula
i*N^2 + j*N + k; //i index for x, j index for y, k index for z
The reason for this bit:
float const argx = point_index_x;
float const argy = point_index_y;
float const argz = point_index_z;
It's because the gauss 2D function does something similar, not sure why though.
from gpufit.
The multi-dimensionality makes this "toy" model more complicated, but I think this is what you want.
__device__ void calculate_vector_function(
float const * parameters,
int const n_fits,
int const n_points,
float * value,
float * derivative,
int const point_index,
int const fit_index,
int const chunk_index,
char * user_info,
std::size_t const user_info_size)
{
// indices
int_const n_function_values = 2;
int const n_points_x = cbrt((float) (n_points/n_function_values));
int const n_points_y = n_points_x;
int const n_points_z = n_points_x;
int const point_index_z = point_index / (n_points_x*n_points_y*n_function_values);
int const point_index_y = (point_index
- (point_index_z*n_points_x*n_points_y*n_function_values) )
/ (n_points_x*n_function_values);
int const point_index_x = ( (point_index
- (point_index_z*n_points_x*n_points_y*n_function_values))
- (point_index_y*n_points_x*n_function_values) )
/ (n_function_values);
float const x = (float) point_index_x;
float const y = (float) point_index_y;
float const z = (float) point_index_z;
// parameters
float const * p = parameters;
// derivative
float * current_derivative = derivative + point_index;
// values and jacobian
if(point_index % 2 == 0) {
// even point indices - first element of the vector
value[point_index] = p[0]*x + p[1]*y; //first component
current_derivative[0 * n_points] = x; //gradient first component
current_derivative[1 * n_points] = y;
current_derivative[2 * n_points] = 0.0f;
} else {
// odd point indices - second element of the vector
value[point_index] = p[0]*y + p[2]*z; //second component
current_derivative[0 * n_points] = x; //gradient second component
current_derivative[1 * n_points] = 0.0f;
current_derivative[2 * n_points] = z;
}
}
from gpufit.
And how do I use this for the fitting now? Should I fit just one function? or two?
The code below is essentially an attempt to modify the gauss2D example in order to fit my vector field.
Can you tell me if my set up (before the call to the gpufit function) is correct?
void vector_function_example() {
/*
Generating sample test for fitting a vector function
*/
// number of fits, fit points and parameters
size_t const n_components = 2;
size_t const n_fits = n_components * 1;;
size_t const size_x = 50;
size_t const n_points_per_fit = size_x * size_x* size_x;
size_t const n_model_parameters = 3;
// true parameters (amplitude, center x position, center y position, width, offset)
std::vector< float > true_parameters{ 1.f, 1.5f, 2.5f};
std::cout << "generate example data" << std::endl;
// initialize random number generator
std::mt19937 rng;
rng.seed(0);
std::uniform_real_distribution< float> uniform_dist(0, 1);
// initial parameters (randomized)
std::vector< float > initial_parameters(n_model_parameters);
//for (size_t i = 0; i < n_fits; i++)
//{
for (size_t j = 0; j < n_model_parameters; j++)
{
initial_parameters[j] = true_parameters[j] + uniform_dist(rng); //This is probably wrong...
}
//}
// generate x, y and z values
std::vector< float > x(n_points_per_fit);
std::vector< float > y(n_points_per_fit);
std::vector< float > z(n_points_per_fit);
for (size_t i = 0; i < size_x; i++)
{
for (size_t j = 0; j < size_x; j++)
{
for(size_t k = 0; k < size_x; k++) {
x[i * size_x*size_x + j*size_x + k] = static_cast<float>(k);
y[i * size_x*size_x + j*size_x + k] = static_cast<float>(j);
z[i * size_x*size_x + j*size_x + k] = static_cast<float>(i);
}
}
}
// generate test data with Poisson noise
std::vector< float > temp(2*n_points_per_fit); //two components per fit
generate_vector_function(x, y, z, true_parameters, temp);
std::vector< float > data(n_fits * n_points_per_fit);
for (size_t i = 0; i < n_fits; i++)
{
for (size_t j = 0; j < n_points_per_fit; j++)
{
std::poisson_distribution< int > poisson_dist(temp[j]+0.1);
data[i * n_points_per_fit + j] = static_cast<float>(poisson_dist(rng));
}
}
// tolerance
float const tolerance = 0.001f;
// maximum number of iterations
int const max_number_iterations = 20;
// estimator ID
int const estimator_id = LSE;
// model ID
int const model_id = VECTOR_FUNCTION;
// parameters to fit (all of them)
std::vector< int > parameters_to_fit(n_model_parameters, 1);
// output parameters
std::vector< float > output_parameters(n_model_parameters);
std::vector< int > output_states(n_fits);
std::vector< float > output_chi_square(n_fits);
std::vector< int > output_number_iterations(n_fits);
// call to gpufit (C interface)
std::chrono::high_resolution_clock::time_point time_0 = std::chrono::high_resolution_clock::now();
int const status = gpufit
(
n_fits,
n_points_per_fit,
data.data(),
0,
model_id,
initial_parameters.data(),
tolerance,
max_number_iterations,
parameters_to_fit.data(),
estimator_id,
0,
0,
output_parameters.data(),
output_states.data(),
output_chi_square.data(),
output_number_iterations.data()
);
std::chrono::high_resolution_clock::time_point time_1 = std::chrono::high_resolution_clock::now();
// check status
if (status != ReturnState::OK)
{
throw std::runtime_error(gpufit_get_last_error());
}
// print execution time
std::cout << "execution time "
<< std::chrono::duration_cast<std::chrono::milliseconds>(time_1 - time_0).count() << " ms" << std::endl;
// get fit states
std::vector< int > output_states_histogram(5, 0);
for (std::vector< int >::iterator it = output_states.begin(); it != output_states.end(); ++it)
{
output_states_histogram[*it]++;
}
std::cout << "ratio converged " << (float)output_states_histogram[0] / n_fits << "\n";
std::cout << "ratio max iteration exceeded " << (float)output_states_histogram[1] / n_fits << "\n";
std::cout << "ratio singular hessian " << (float)output_states_histogram[2] / n_fits << "\n";
std::cout << "ratio neg curvature MLE " << (float)output_states_histogram[3] / n_fits << "\n";
std::cout << "ratio gpu not read " << (float)output_states_histogram[4] / n_fits << "\n";
// compute mean of fitted parameters for converged fits
std::vector< float > output_parameters_mean(n_model_parameters, 0);
for (size_t i = 0; i != n_fits; i++)
{
if (output_states[i] == FitState::CONVERGED)
{
for (size_t j = 0; j < n_model_parameters; j++)
{
output_parameters_mean[j] += output_parameters[i * n_model_parameters + j];
}
}
}
// normalize
for (size_t j = 0; j < n_model_parameters; j++)
{
output_parameters_mean[j] /= output_states_histogram[0];
}
// compute std of fitted parameters for converged fits
std::vector< float > output_parameters_std(n_model_parameters, 0);
for (size_t i = 0; i != n_fits; i++)
{
if (output_states[i] == FitState::CONVERGED)
{
for (size_t j = 0; j < n_model_parameters; j++)
{
output_parameters_std[j]
+= (output_parameters[i * n_model_parameters + j] - output_parameters_mean[j])
* (output_parameters[i * n_model_parameters + j] - output_parameters_mean[j]);
}
}
}
// normalize and take square root
for (size_t j = 0; j < n_model_parameters; j++)
{
output_parameters_std[j] = sqrt(output_parameters_std[j] / output_states_histogram[0]);
}
// print true value, fitted mean and std for every parameter
for (size_t j = 0; j < n_model_parameters; j++)
{
std::cout
<< "parameter " << j
<< " true " << true_parameters[j]
<< " fitted mean " << output_parameters_mean[j]
<< " std " << output_parameters_std[j] << std::endl;
}
// compute mean chi-square for those converged
float output_chi_square_mean = 0;
for (size_t i = 0; i != n_fits; i++)
{
if (output_states[i] == FitState::CONVERGED)
{
output_chi_square_mean += output_chi_square[i];
}
}
output_chi_square_mean /= static_cast<float>(output_states_histogram[0]);
std::cout << "mean chi square " << output_chi_square_mean << std::endl;
// compute mean number of iterations for those converged
float output_number_iterations_mean = 0;
for (size_t i = 0; i != n_fits; i++)
{
if (output_states[i] == FitState::CONVERGED)
{
output_number_iterations_mean += static_cast<float>(output_number_iterations[i]);
}
}
// normalize
output_number_iterations_mean /= static_cast<float>(output_states_histogram[0]);
std::cout << "mean number of iterations " << output_number_iterations_mean << std::endl;
}
from gpufit.
Note that my modified example may not be the fastest solution - it is meant for illustrating the point. Also, the modulo operator is supposedly slow in CUDA.
from gpufit.
Doesn't matter for now, as long as it works for now. I'll work out some of those details later.
from gpufit.
I have to leave this for now. Hopefully you can see the point that this is all about deciding how to organize the multiple dimensions and multiple elements of the function in memory. Each thread on the GPU needs to know what it is responsible for computing, which X, Y, and Z coordinates it represents and which element of the function it will calculate. That's all that you need to do. When passing the data and initial parameters into Gpufit, you need to organize them in the same way that your model function is expecting them. You also need to conform to the definitions of n_fits, n_points, and n_parameters, so that Gpufit knows how large the various arrays are.
from gpufit.
I'm still missing the shared parameters bit. According to the code of the gauss2d example I can see that for each fit we have a different parameters vector, and each fit will use such vector independently. In my case
instead I'm missing how to specify that both fittings should rely on the same parameter vector (this is in the last function I posted).
Thank you for you help anyway, I appreciate it.
from gpufit.
I rewrote your calling function. See my changes below.
void vector_function_example() {
/*
Generating sample test for fitting a vector function
*/
// number of fits, fit points and parameters
size_t const n_components = 2;
size_t const n_fits = 1;
size_t const size_x = 50;
size_t const n_points_per_fit = size_x * size_x* size_x * n_components;
size_t const n_model_parameters = 3;
size_t temp, offset;
// true parameters
std::vector< float > true_parameters{ 1.f, 1.5f, 2.5f};
// initial parameters
std::vector< float > initial_parameters(n_fits*n_model_parameters);
for (size_t i = 0; i < n_fits; i++)
{
for (size_t j = 0; j < n_model_parameters; j++)
{
initial_parameters[(i*n_model_parameters) + j] = true_parameters[j] + 0.1;
}
}
std::vector< float > data(n_fits * n_points_per_fit);
for (size_t i = 0; i < n_fits; i++)
{
offset = i * n_points_per_fit;
temp = 0;
for (size_t z = 0; z < size_x; z++)
{
for (size_t y = 0; y < size_x; y++)
{
for (size_t x = 0; x < size_x; x++)
{
data[offset + temp] = true_parameters[0]*x + true_parameters[1]*y;
temp = temp + 1;
data[offset + temp] = true_parameters[0]*y + true_parameters[2]*z;
temp = temp + 1;
}
}
}
}
// tolerance
float const tolerance = 0.001f;
// maximum number of iterations
int const max_number_iterations = 20;
// estimator ID
int const estimator_id = LSE;
// model ID
int const model_id = VECTOR_FUNCTION;
// parameters to fit (all of them)
std::vector< int > parameters_to_fit(n_model_parameters, 1);
// output parameters
std::vector< float > output_parameters(n_fits * n_model_parameters);
std::vector< int > output_states(n_fits);
std::vector< float > output_chi_square(n_fits);
std::vector< int > output_number_iterations(n_fits);
// call to gpufit (C interface)
int const status = gpufit
(
n_fits,
n_points_per_fit,
data.data(),
0,
model_id,
initial_parameters.data(),
tolerance,
max_number_iterations,
parameters_to_fit.data(),
estimator_id,
0,
0,
output_parameters.data(),
output_states.data(),
output_chi_square.data(),
output_number_iterations.data()
);
from gpufit.
Hi again, this is the output I'm getting:
Running lmfit.run(tolerance_);
Running lmfit_cuda.run();
Finish lmfit_cuda.run();
execution time lmfit_cuda.run() 196 ms
Finish lmfit.run(tolerance_);
execution time lmfit.run(...) 201 ms
execution time 477 ms
ratio converged 1
ratio max iteration exceeded 0
ratio singular hessian 0
ratio neg curvature MLE 0
ratio gpu not read 0
parameter 0 true 1 fitted mean 1 std 0
parameter 1 true 1.5 fitted mean 1.5 std 0
parameter 2 true 2.5 fitted mean 2.5 std 0
mean chi square 1.37772e-05
mean number of iterations 10
Example completed!
Press ENTER to exit
Which seems to be correct. I need to understand a bit how the whole thing works because I have many more parameters.
I post the two functions then:
model to be fit:
__device__ void calculate_vector_function(
float const * parameters,
int const n_fits,
int const n_points,
float * value,
float * derivative,
int const point_index,
int const fit_index,
int const chunk_index,
char * user_info,
std::size_t const user_info_size)
{
// indices
int const n_function_values = 2;
int const n_points_x = cbrt((float) (n_points/2));
int const n_points_y = n_points_x;
int const n_points_z = n_points_x;
int const point_index_z = point_index / (n_points_x*n_points_y*n_function_values);
int const point_index_y = (point_index - (point_index_z*n_points_x*n_points_y*n_function_values) ) / (n_points_x*n_function_values);
int const point_index_x = ( (point_index - (point_index_z*n_points_x*n_points_y*n_function_values)) - (point_index_y*n_points_x*n_function_values) ) / (n_function_values);
float const x = (float) point_index_x;
float const y = (float) point_index_y;
float const z = (float) point_index_z;
// parameters
float const * p = parameters;
// derivative
float * current_derivative = derivative + point_index;
// values and jacobian
if(point_index % 2 == 0) {
// even point indices - first element of the vector
value[point_index] = p[0]*x + p[1]*y; //first component
current_derivative[0 * n_points] = x; //gradient first component
current_derivative[1 * n_points] = y;
current_derivative[2 * n_points] = 0.0f;
} else {
// odd point indices - second element of the vector
value[point_index] = p[0]*y + p[2]*z; //second component
current_derivative[0 * n_points] = x; //gradient second component
current_derivative[1 * n_points] = 0.0f;
current_derivative[2 * n_points] = z;
}
}
test case:
void vector_function_example() {
/*
Generating sample test for fitting a vector function
*/
// number of fits, fit points and parameters
size_t const n_components = 2;
size_t const n_fits = 1;
size_t const size_x = 50;
size_t const n_points_per_fit = size_x * size_x* size_x * n_components;
size_t const n_model_parameters = 3;
size_t temp, offset;
// true parameters
std::vector< float > true_parameters{ 1.f, 1.5f, 2.5f };
// initial parameters
std::vector< float > initial_parameters(n_fits*n_model_parameters);
for (size_t i = 0; i < n_fits; i++)
{
for (size_t j = 0; j < n_model_parameters; j++)
{
initial_parameters[(i*n_model_parameters) + j] = true_parameters[j] + 0.1;
}
}
std::vector< float > data(n_fits * n_points_per_fit);
for (size_t i = 0; i < n_fits; i++)
{
offset = i * n_points_per_fit;
temp = 0;
for (size_t z = 0; z < size_x; z++)
{
for (size_t y = 0; y < size_x; y++)
{
for (size_t x = 0; x < size_x; x++)
{
data[offset + temp] = true_parameters[0] * x + true_parameters[1] * y;
temp = temp + 1;
data[offset + temp] = true_parameters[0] * y + true_parameters[2] * z;
temp = temp + 1;
}
}
}
}
// tolerance
float const tolerance = 0.001f;
// maximum number of iterations
int const max_number_iterations = 20;
// estimator ID
int const estimator_id = LSE;
// model ID
int const model_id = VECTOR_FUNCTION;
// parameters to fit (all of them)
std::vector< int > parameters_to_fit(n_model_parameters, 1);
// output parameters
std::vector< float > output_parameters(n_fits * n_model_parameters);
std::vector< int > output_states(n_fits);
std::vector< float > output_chi_square(n_fits);
std::vector< int > output_number_iterations(n_fits);
// call to gpufit (C interface)
std::chrono::high_resolution_clock::time_point time_0 = std::chrono::high_resolution_clock::now();
int const status = gpufit
(
n_fits,
n_points_per_fit,
data.data(),
0,
model_id,
initial_parameters.data(),
tolerance,
max_number_iterations,
parameters_to_fit.data(),
estimator_id,
0,
0,
output_parameters.data(),
output_states.data(),
output_chi_square.data(),
output_number_iterations.data()
);
std::chrono::high_resolution_clock::time_point time_1 = std::chrono::high_resolution_clock::now();
// check status
if (status != ReturnState::OK)
{
throw std::runtime_error(gpufit_get_last_error());
}
// print execution time
std::cout << "execution time "
<< std::chrono::duration_cast<std::chrono::milliseconds>(time_1 - time_0).count() << " ms" << std::endl;
// get fit states
std::vector< int > output_states_histogram(5, 0);
for (std::vector< int >::iterator it = output_states.begin(); it != output_states.end(); ++it)
{
output_states_histogram[*it]++;
}
std::cout << "ratio converged " << (float)output_states_histogram[0] / n_fits << "\n";
std::cout << "ratio max iteration exceeded " << (float)output_states_histogram[1] / n_fits << "\n";
std::cout << "ratio singular hessian " << (float)output_states_histogram[2] / n_fits << "\n";
std::cout << "ratio neg curvature MLE " << (float)output_states_histogram[3] / n_fits << "\n";
std::cout << "ratio gpu not read " << (float)output_states_histogram[4] / n_fits << "\n";
// compute mean of fitted parameters for converged fits
std::vector< float > output_parameters_mean(n_model_parameters, 0);
for (size_t i = 0; i != n_fits; i++)
{
if (output_states[i] == FitState::CONVERGED)
{
for (size_t j = 0; j < n_model_parameters; j++)
{
output_parameters_mean[j] += output_parameters[i * n_model_parameters + j];
}
}
}
// normalize
for (size_t j = 0; j < n_model_parameters; j++)
{
output_parameters_mean[j] /= output_states_histogram[0];
}
// compute std of fitted parameters for converged fits
std::vector< float > output_parameters_std(n_model_parameters, 0);
for (size_t i = 0; i != n_fits; i++)
{
if (output_states[i] == FitState::CONVERGED)
{
for (size_t j = 0; j < n_model_parameters; j++)
{
output_parameters_std[j]
+= (output_parameters[i * n_model_parameters + j] - output_parameters_mean[j])
* (output_parameters[i * n_model_parameters + j] - output_parameters_mean[j]);
}
}
}
// normalize and take square root
for (size_t j = 0; j < n_model_parameters; j++)
{
output_parameters_std[j] = sqrt(output_parameters_std[j] / output_states_histogram[0]);
}
// print true value, fitted mean and std for every parameter
for (size_t j = 0; j < n_model_parameters; j++)
{
std::cout
<< "parameter " << j
<< " true " << true_parameters[j]
<< " fitted mean " << output_parameters_mean[j]
<< " std " << output_parameters_std[j] << std::endl;
}
// compute mean chi-square for those converged
float output_chi_square_mean = 0;
for (size_t i = 0; i != n_fits; i++)
{
if (output_states[i] == FitState::CONVERGED)
{
output_chi_square_mean += output_chi_square[i];
}
}
output_chi_square_mean /= static_cast<float>(output_states_histogram[0]);
std::cout << "mean chi square " << output_chi_square_mean << std::endl;
// compute mean number of iterations for those converged
float output_number_iterations_mean = 0;
for (size_t i = 0; i != n_fits; i++)
{
if (output_states[i] == FitState::CONVERGED)
{
output_number_iterations_mean += static_cast<float>(output_number_iterations[i]);
}
}
// normalize
output_number_iterations_mean /= static_cast<float>(output_states_histogram[0]);
std::cout << "mean number of iterations " << output_number_iterations_mean << std::endl;
}
from gpufit.
Leaving this open as a reminder to add this case to the examples library and documentation.
from gpufit.
Related Issues (20)
- Communication with Cupy HOT 14
- ReadtheDocs broken link HOT 1
- Custom function fails to to fit in Python, only performing a single iteration. HOT 1
- Installation Difficulties HOT 2
- cuBlas Entry Point Not Found HOT 5
- Different constraints for every fit HOT 6
- <limits> not included in info.h HOT 1
- CMake not finding Python HOT 1
- Best debug strategy HOT 1
- Cpufit API documentation and external bindings HOT 3
- Automated build test and release with github actions HOT 2
- Cuda interface mallocs and pybind11
- Mac installation HOT 1
- Issues after installing on Linux computer
- Linux compiling errors HOT 2
- CPUFIT python binding HOT 1
- cmake doesn't find python interpreter and python wheel file does not make. HOT 1
- could not build python package by setuptools HOT 1
- ModuleNotFoundError: No module named 'cpufit' - Running 'make' HOT 3
- Use GAUSS_2D_ROTATED to fit a 2D Gaussian, the returned state is 2 HOT 3
Recommend Projects
-
React
A declarative, efficient, and flexible JavaScript library for building user interfaces.
-
Vue.js
🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
-
Typescript
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
-
TensorFlow
An Open Source Machine Learning Framework for Everyone
-
Django
The Web framework for perfectionists with deadlines.
-
Laravel
A PHP framework for web artisans
-
D3
Bring data to life with SVG, Canvas and HTML. 📊📈🎉
-
Recommend Topics
-
javascript
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
-
web
Some thing interesting about web. New door for the world.
-
server
A server is a program made to process requests and deliver data to clients.
-
Machine learning
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
-
Visualization
Some thing interesting about visualization, use data art
-
Game
Some thing interesting about game, make everyone happy.
Recommend Org
-
Facebook
We are working to build community through open source technology. NB: members must have two-factor auth.
-
Microsoft
Open source projects and samples from Microsoft.
-
Google
Google ❤️ Open Source for everyone.
-
Alibaba
Alibaba Open Source for everyone
-
D3
Data-Driven Documents codes.
-
Tencent
China tencent open source team.
from gpufit.