Comments (22)
@sergiopasra - just to check, does this depend on the optimization level you compile with?
from erfa.
@astrofrog Yes, with -O0 and -O1 it works; with -O2 and -O3 it doesn't (only on i386, x86_64 works with both)
@sergiopasra - do you see the same issue when compiling SOFA? If so, we'll need to report it upstream.
@astrofrog I get the same result if I change the compilation flags of SOFA from "-pedantic -Wall -W -O" to "-pedantic -Wall -W -O2".
I'm not sure they are going to consider this a bug, though.
How do we know this isn't a GCC6 compiler bug? How much does the value change between GCC5 and GCC6? Does reordering some expressions fix it? How much does the tolerance have to change to make the test pass (not very much)? What is the actual required precision of this calculation?
This also affects Debian: Bug 835105.
I played around a bit with the different optimization options: the source code in question is src/starpv.c, and the responsible optimization option is -fcaller-saves. Applying -O2 -fno-caller-saves to gcc -c starpv.c works well; -O1 -fcaller-saves shows the failure.
So this looks like a GCC compiler bug then.
Agreed. I could extract the following code from starpv.c:
#include <stdio.h>
#include <stdlib.h>
#include <math.h>

double f(double x) {
    return sqrt(1.0 - x) - 1.0;
}

double g(double x) {
    double res = f(x);
    printf("r =%.20g\n", res);
    return res;
}

int main(void) {
    double x = 3.1740879016271733482e-09;
    double r1 = f(x);
    double r2 = g(x);
    printf("r1=%.20g r2=%.20g diff=%20g\n", r1, r2, r2 - r1);
    exit(0);
}
This code should print the same number for r1 and r2, and a diff of 0. When compiled with gcc -O2 -o m m.c -lm, I get the result

r =-1.5870439520936432953e-09
r1=-1.5870439407095204842e-09 r2=-1.5870439520936432953e-09 diff= -1.13841e-17

When I add -fno-caller-saves, everything is fine. The numbers are taken from the erfa test case.
FYI: I added a gcc bug for this issue.
Not really a gcc bug: i386 does not have reliable floating-point math due to the 80-bit x87 FPU. If you need it, try -fexcess-precision=standard, -ffloat-store, or -mfpmath=sse.
See https://gcc.gnu.org/wiki/x87note (or the super old gcc issue 323).
@juliantaylor When thinking about the numbers here, I don't think this is the issue: with double precision (64 bit), we have a mantissa of 52(+1) bits and a precision of ~16 digits. The two numbers in the test case, however, already differ in the 8th digit. Having 80 bits here, or converting 80 <-> 64 bit, cannot explain this.
I don't know what you are computing, but that the numbers differ in the 8th digit does not necessarily imply that the excess precision is not to blame; rounding differences can easily be amplified.
Also, that -fno-caller-saves "fixes" it speaks for this: that flag basically causes a limited form of -ffloat-store (but possibly worse for performance, as it also affects non-float register allocations).
Does -fexcess-precision=standard fix the issue?
The code is above, and I see nothing there that could amplify the error. From rounding 80 -> 64 bit, I see no reason why it should be legitimate to have an error in the 8th digit; the error should be on the order of the precision. -fexcess-precision=standard fixes the problem, however.
I don't think it's obvious to say that from that code; there are several things in it that could amplify the error: subtractions, divisions, it's got it all.
An actual compiler mis-compilation that produces an 8th-digit accuracy difference, on the other hand, seems less likely to me, in particular one that disabling caller saves fixes.
In the end only a proper numerical analysis, or a look at the disassembly, can prove what is to blame, but for me the evidence that it is the well-known x87 issue is pretty overwhelming, while a mis-compilation is still just speculation.
Then it's not really worthwhile to dive deeper into this to see what it is; you should never do numerics on x87 FPUs without proper compiler flags anyway (-mfpmath=sse is the best option for determinism and performance).
OK, agreed (and I am going to close the gcc bug as well). However, I am not happy with just enabling these flags: IMO they don't solve the underlying problem, but hide it. The function eraStarpv() is quite inaccurate because of sqrt(1.0 - betr*betr - bett*bett) - 1.0 if betr and bett are very small. In the test case of astropy, they are ~ 10^-4, which translates into an accuracy of only ~ 8 digits. Astropy, however, seems to need a higher accuracy in this loop when handling ephemerides.
IMO, there is one thing needed/possible here (I originally listed two; see the edit on the second point):
- The mentioned code line should be changed for small values of betr and bett (use a polynomial instead of the square root in this case, or similar). This should be discussed with upstream. Or at least the limited precision should be documented.
- Astropy should not assume that the precision of eraStarpv() is very high, and change the loop accordingly. Edit: I mixed up things here. This was nonsense ;-)
I will copy my comments to astropy/astropy#5425 as well.
The point here is IMO that the x87 FPU does not give wrong results, just different ones. Both results are as expected within the accuracy; the 64-bit results are just more predictable. This gives us the chance to find exactly such problems, and therefore I would vote against just setting -mfpmath=sse or such. Instead, i386 CI tests would be quite useful.
Indeed, that expression is poor for small values.
Herbie recommends this replacement: http://herbie.uwplse.org/demo/6c625a8ebdbd46954b959c975040214b/graph.html
Funnily (but as expected), the i386 result seems to be the better one ;-)
It's not actually very surprising: the x87 FPU does have 80-bit precision for intermediate results, so it being better is not unusual. But its indeterminism often outweighs this. If you have something that needs 80 bits, you can use long doubles, at some performance cost on newer machines (and with less portability, as some arches don't have 80-bit long doubles).
Well, deterministically inaccurate results are not really what one wants...
You can easily construct cases where 80 bits are not enough either; it just shifts the problem a tiny bit.
Determinism is far more valuable.
An algorithm should give predictable results within the expected accuracy. If it does not, it is the algorithm that is to blame, not the "unpredictability" of the FPU. If we had just enabled -mfpmath=sse, we would not have detected the cause of the problem. Unpredicted results within the specified accuracy therefore help to keep algorithms good, while -mfpmath=sse just makes them look good and hides the real problem.
Just make sure that the algorithm is robust; then one doesn't need to care about additional unexpected precision.