Comments (18)
The point here is that we are saving floating-point data (4-byte floats are enough) in 2-byte ints, and with that we obviously lose precision. The situation is not that bad in GSHHG, because it uses a binning scheme: knowing the bin, we already know the integer part of the bin-corner origin, so the 2 bytes (0-65535) only need to store the fractional part. In DCW the situation is different. Here we want to store the data as polygons, so we cannot use the binning, and as a consequence the 2 bytes can provide a precision of only ~0.003 degrees (1 / 65535 ≈ 1.5e-5; 1.5e-5 * 180 ≈ 0.0027).
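To make the arithmetic concrete, here is a quick sketch (a plain awk one-liner, not GMT code) of the quantization step you get when a full coordinate range is spread over 2 bytes:

```shell
# Degrees per quantization step when a coordinate range is packed into an
# unsigned 16-bit integer (0-65535); 360 degrees is the worst case for
# longitude when no binning is used.
awk 'BEGIN { printf "%.6f\n", 360 / 65535 }'
```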
from gmt.
This is controlled by the FORMAT_FLOAT_OUT parameter, e.g.:
gmt coast -M -W1p -R0/10/0/10 --FORMAT_FLOAT_OUT=%.5f
It's not documented, so we should improve the documentation. PR is welcomed.
Yes, it is, but I imagine most people don't think about this, hence it may be hardcoded. Mentioning this in the docs is a good solution - will make a PR, probably.
The source data for DCW has only 5 decimals, e.g.:
$ head orig/EU/NO.txt
> norway 0
5.127303 59.824047
5.139871 59.816860
5.140088 59.813950
5.135608 59.813023
5.131952 59.814030
5.128451 59.819134
5.122212 59.821632
5.127303 59.824047
> norway 1
Still, dumping the polygon gives 10 decimals, the latter 5 not being 0:
$ gmt coast -ENO -M | head
> Norway Segment 0
5.12749572793 59.8240650838
5.14002533806 59.8168021861
5.14002533806 59.8139777259
5.13557934737 59.8129689901
5.13194171862 59.8139777259
5.12830408988 59.8190214048
5.1222413753 59.8216441179
5.12749572793 59.8240650838
> Norway Segment 1
How can this be?
Single precision for floating-point numbers?
Well, it is documented in the sense that FORMAT_FLOAT_OUT
controls the format of all data written in ASCII, for all modules. When it doesn't, as happened not too long ago with pscoast (I think), that is a bug.
But I noticed something that is worse. Although these data do not have high positional precision to begin with, we are degrading them by roughly ~20 m. This is a consequence of the binning/scaling algorithm, but something to keep in mind for the future.
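As a rough cross-check of that ~20 m figure (a sketch only: a quantization step on the order of 2e-4 degrees and the usual ~111320 m per degree of arc are assumed, not taken from the file):

```shell
# Convert an assumed quantization step of 0.0002 degrees to metres on the
# ground, using ~111320 m per degree of great-circle arc (an approximation).
awk 'BEGIN { printf "%.1f m\n", 0.0002 * 111320 }'
```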
How can this be?
If you look into the dcw-gmt.nc file with an HDF explorer, you will see that the data is stored as short integers (2 bytes). This was the scheme used originally to compress the GSHHG data to ~45 MB, which was still huge 30 years ago.
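A minimal sketch of that offset-and-scale packing (the min/max bounds here are made up for illustration; the real file stores a per-polygon offset and scale):

```shell
# Hypothetical round trip: pack a longitude into a 2-byte count and decode it.
awk -v lon=5.127303 -v min=4.0 -v max=31.0 'BEGIN {
  scale = 65535 / (max - min)             # counts per degree for this polygon
  i     = int((lon - min) * scale + 0.5)  # stored 16-bit value
  back  = min + i / scale                 # value recovered on read
  printf "%d %.6f\n", i, back
}'
```

The decoded value differs from the input in the 5th decimal, which is exactly the kind of drift seen in the dumped Norway polygon above.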
Ok, so this is an explainable artifact then, I assume, based on your answer.
(I've read that numbers become complicated, with all kinds of strange rounding effects, once you go into the float/long/etc. world, so I won't go down that hole right now.)
Thanks, interesting. So just 4 decimals are basically enough?
Not sure I understand the question. We cannot choose the number of significant decimals. We have what we have, and if I'm right, the precision decreases as we move away from Greenwich and the Equator.
Alright, thanks. I might make a PR just noting in the coast docs that one may consider setting FORMAT_FLOAT_OUT.
This is the script that creates the DCW file.
There it says:
# Set enough decimals to avoid bad rounding
rm -f gmt.conf
gmt set FORMAT_FLOAT_OUT %.14g
And from what I understand by looking at lines 116 to 150 of the script, the 2-byte range is set to the longitude and latitude range of each polygon. In practice this means that larger polygons have lower accuracy.
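Under that scheme the step size scales with the polygon's extent. A rough sketch (the 170- and 2-degree ranges below are approximate stand-ins for a very wide and a very narrow polygon, just for illustration):

```shell
# Quantization step = range / 65535, using each polygon's own bounding range.
awk 'BEGIN {
  printf "wide (170 deg): %.6f deg/count\n", 170 / 65535
  printf "narrow (2 deg): %.6f deg/count\n",   2 / 65535
}'
```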
Exploring the file, I extracted the scales used (dcw-scales.txt).
Here are the most severe cases (by ISO code):

| Variable | Scale value | 1 / Scale |
|---|---|---|
| AQ_lon:scale | 182.150451 | 0.00549 |
| RU_lon:scale | 382.7710674 | 0.00261 |
| US_lon:scale | 543.3403805 | 0.00184 |
| CA_lon:scale | 741.4468027 | 0.00135 |
| GE_lon:scale | 9765.136073 | 0.00010 |
| IS_lat:scale | 19998.35216 | 0.00005 |
| JM_lat:scale | 79588.59491 | 0.00001 |
That is, the worst accuracies are for the longitude values of the polygons of Antarctica (AQ), the United States, Russia, and Canada.
For the longitude of Germany we have a precision of 0.0001; for the latitude of Iceland, 0.00005.
We preserve a precision of ~1e-5 (close to the original data) for the latitude of Jamaica.
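To put the worst case in ground units (a sketch; the ~111320 m per degree approximation and a representative latitude of 70°S for Antarctica are assumptions, not taken from the table):

```shell
# AQ longitude step = 1/182.150451 degrees; scale by cos(latitude) for the
# east-west ground distance, here at 70S.
awk 'BEGIN {
  pi   = atan2(0, -1)
  step = 1 / 182.150451
  printf "%.0f m\n", step * 111320 * cos(70 * pi / 180)
}'
```

So the Antarctica longitudes can be off by a couple hundred metres, an order of magnitude worse than the ~20 m typical case mentioned above.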
I made these two maps to compare the original data (from the orig
dir) and the processed data (from DCW).
Ideally the lines should overlap. Differences are visible, but they don't look as bad as I expected.
Full script
origen=/home/federico/Github/GenericMappingTools/dcw-gmt/orig/
gmt begin Russia png
gmt coast -ERU+pred -R32/35/66/67 -Baf -JM25c
gmt plot $origen/AS/RU.txt -Wfaint,green
gmt basemap -L+w20k+o3c+f+u
gmt end
gmt begin Antartida png
gmt coast -EAQ+pred -R-65.5/-63/-66/-65 -Baf -JM25c
gmt plot $origen/AN/AQ.txt -Wfaint,green -l"Original Data"
gmt basemap -L+w20k+o3c+f+u
gmt end
Now I zoomed in and added the GSHHG data set.
If I assume that GSHHG is the truth, then there is no point in improving the accuracy of the DCW data, given how low it already is (at this zoom). "The Digital Chart of the World is a comprehensive 1:1,000,000 scale vector basemap of the world."
Conclusion. I think DCW should be left as it is.
gmt begin Russia2 png
gmt coast -ERU+pred -R33:20/34/66:15/66.5 -Baf -JM25c
gmt plot $origen/AS/RU.txt -Wfaint,green
gmt coast -W
gmt basemap -L+w5k+o3c+f+u
gmt end
No, GSHHG cannot be assumed to be the truth. We get constant complaints about how the coastlines do not align with other data or satellite images. GSHHG is very old and suffers from using a different datum than modern data.
All our effort on the coastlines/borders front should be concentrated on creating a new full++
coasts file, but the big problem is that we need to recreate a tool that is able to make such a file.
Dealt with in #8524.