Comments (9)
Prove me wrong, but the sole purpose of the project is to demonstrate the possibility, not to actually get usable or sane results.
pifs is unusable and in no way a viable project; it has no real value apart from academic use, I guess. This project is more of a joke than a real-life solution. Although it works in some way, the issue you've raised is quite obvious and is a show-stopper.
Increasing the data chunk length leads to increased offset lengths, which tend to take more memory than they save. Though this project might be more usable as a cryptography solution of some kind.
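To put rough numbers on the offset growth, here is a minimal sketch. It does not use real digits of pi: `stream` is a pseudo-random digit string standing in for pi (an assumption, since it only mimics pi if pi's digits behave like uniform random digits), and `first_offset` and `average_offset_length` are hypothetical helper names, not anything from pifs.

```python
import random

def first_offset(digits, chunk):
    """Index of the first occurrence of chunk in digits, or None if absent."""
    pos = digits.find(chunk)
    return pos if pos >= 0 else None

def average_offset_length(digits, chunk_len, trials, rng):
    """Mean number of decimal digits needed to write the offset of a random chunk.

    Chunks that never occur in digits are skipped.
    """
    lengths = []
    for _ in range(trials):
        chunk = "".join(rng.choice("0123456789") for _ in range(chunk_len))
        pos = first_offset(digits, chunk)
        if pos is not None:
            lengths.append(len(str(pos)))
    return sum(lengths) / len(lengths) if lengths else None

rng = random.Random(0)  # fixed seed so runs are repeatable
# Stand-in for pi: pseudo-random digits, NOT real digits of pi.
stream = "".join(rng.choice("0123456789") for _ in range(200_000))
for chunk_len in range(1, 5):
    # Prints the chunk length next to the average offset length in digits.
    print(chunk_len, average_offset_length(stream, chunk_len, 200, rng))
```

If the average offset length tracks the chunk length, the offset costs about as much as the data it replaces, which is the point made above.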
from pifs.
Related: Are there any compression algorithms based on PI?
from pifs.
I supposed the author of pifs had already solved this problem (through an additional bit, for example). But really, how about some statistics proving pifs's viability? I mean its ability to compress data.
from pifs.
It cannot therefore be stated that:
Increasing the data chunk length leads to increased offset lengths
This is not always the case. You are just as likely to find a large data chunk in the first several digits of pi as you are anywhere in pi.
from pifs.
You are just as likely to find a large data chunk in the first several digits of pi as you are anywhere in pi.
That's not true. If we have a fixed-size integer cell for a pi offset, it can point to only a limited number of places in pi. In that part of pi there is a large number of small chunks and a smaller number of bigger chunks. And even if all possible small chunks are present, not every small chunk has every possible continuation.
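A rough way to see this limit, as a sketch under stated assumptions rather than anything pifs actually does: suppose offsets are stored in a fixed number of bits (`offset_bits` is a hypothetical parameter). Then at most 2^offset_bits start positions are addressable, so at most that many distinct chunks of any given length can be reached, no matter how the digits are distributed.

```python
def max_addressable_fraction(offset_bits, chunk_digits):
    """Upper bound on the fraction of all decimal chunks of a given length
    that a fixed-width offset can possibly point to (pigeonhole bound)."""
    positions = 2 ** offset_bits          # distinct offsets the cell can hold
    possible_chunks = 10 ** chunk_digits  # distinct decimal strings of that length
    return min(1.0, positions / possible_chunks)

# With a 32-bit offset, short chunks can all be covered in principle,
# but the reachable fraction collapses as the chunk gets longer.
for digits in (4, 8, 16, 32):
    print(digits, max_addressable_fraction(32, digits))
```

This is only an upper bound: it says nothing about which chunks actually occur in that part of pi.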
from pifs.
It's very true that it's mathematically impossible for a single file to be stored using this method and use less memory than it would now. But, what of a cloud solution? Storing individual bits this way, sort of like rainbow tables, and referencing individual bits--or even long strings of bits--which can be compiled by reference.
Granted, this may or may not have the same speed deficiencies that the current code has, but I believe that, storage-wise, storing this metadata would be less of a burden.
from pifs.
How about this algorithm: once you know the exact pi offset B (a bignum), you create its estimated value D (as a Long exponent of some big base). You can then convert D back to a bignum and get D' < B.
Then you take the first few bytes of the original sequence and check whether the segment D'B (from D' to B) contains this short opening subsequence. If it does, you slightly lengthen the opening subsequence and try again.
Finally you get D (a 64-bit Long), the length of the original sequence, and an initial subsequence of the original sequence. So in the worst case, Result = Input + 64 bits + ? bits for length.
from pifs.
@georgy7 I don't think there are fewer large chunks within a range of pi than there are small chunks.
Consider having n sequential digits of pi and wanting to determine the number of unique sequential subsets of size x. At position 0 the first subset is 0 through to position x, and at position 1 the second subset is 1 through to position x + 1. Thus at position i the subset is i through to position x + i. Subsets may be made from position 0 through to position n - x.
Thus when looking for chunks through the first 10000 digits of Pi a chunk size of :
- 16 results in 9984 chunks.
- 32 results in 9968 chunks.
So doubling the chunk size barely reduces the number of potential chunks.
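The window count can be checked with a short sketch. It assumes the usual slicing convention (start inclusive, end exclusive) with start positions 0 through n - x inclusive; `count_windows` is a hypothetical helper name. Under that convention the counts come out one higher than the figures quoted above, which does not change the conclusion.

```python
def count_windows(n, x):
    """Number of length-x windows in a sequence of n items (starts 0..n-x)."""
    return max(0, n - x + 1)

# Cross-check the formula against an explicit enumeration on sample data.
digits = "1415926535"  # first 10 decimal digits of pi, used only as sample data
windows = [digits[i:i + 4] for i in range(len(digits) - 4 + 1)]
assert len(windows) == count_windows(len(digits), 4)

print(count_windows(10000, 16), count_windows(10000, 32))
```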
Next consider uniqueness. Larger chunks are more likely to be unique, and for this reason I believe there will likely be the same number or more unique chunks of a larger size than of a smaller one.
Experiment :
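
    # Read the digits of pi as one string (file name as given in this post).
    with open('PI25K_DP.TXT', 'r') as f:
        pi_str = f.read().rstrip()

    print(pi_str)
    print('digits : {0}'.format(len(pi_str)))


    def count_unique_chunks(chunk_size):
        """Count the distinct substrings of length chunk_size in pi_str."""
        unique_chunks = set()
        # Note: the strict '<' leaves out the final chunk, which starts at
        # len(pi_str) - chunk_size; it is kept so the output matches the results.
        for i in range(len(pi_str)):
            if i < len(pi_str) - chunk_size:
                unique_chunks.add(pi_str[i:i + chunk_size])
        num_unique_chunks = len(unique_chunks)
        print('{0} | {1}'.format(chunk_size, num_unique_chunks))
        return num_unique_chunks


    for chunk_size in range(64):
        count_unique_chunks(chunk_size)
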
You may find PI25K_DP.TXT here.
Results :
Chunk Size | Number of Unique Chunks |
---|---|
0 | 1 |
1 | 10 |
2 | 100 |
3 | 1000 |
4 | 9169 |
5 | 22170 |
6 | 24690 |
7 | 24957 |
8 | 24987 |
9 | 24991 |
10 | 24990 |
11 | 24989 |
12 | 24988 |
13 | 24987 |
14 | 24986 |
15 | 24985 |
16 | 24984 |
17 | 24983 |
18 | 24982 |
19 | 24981 |
20 | 24980 |
21 | 24979 |
22 | 24978 |
23 | 24977 |
24 | 24976 |
25 | 24975 |
26 | 24974 |
27 | 24973 |
28 | 24972 |
29 | 24971 |
30 | 24970 |
31 | 24969 |
32 | 24968 |
33 | 24967 |
34 | 24966 |
35 | 24965 |
36 | 24964 |
37 | 24963 |
38 | 24962 |
39 | 24961 |
40 | 24960 |
41 | 24959 |
42 | 24958 |
43 | 24957 |
44 | 24956 |
45 | 24955 |
46 | 24954 |
47 | 24953 |
48 | 24952 |
49 | 24951 |
50 | 24950 |
51 | 24949 |
52 | 24948 |
53 | 24947 |
54 | 24946 |
55 | 24945 |
56 | 24944 |
57 | 24943 |
58 | 24942 |
59 | 24941 |
60 | 24940 |
61 | 24939 |
62 | 24938 |
63 | 24937 |
from pifs.
@dreid93
The first 25K digits of pi contain 24937 chunks of length 63 decimal digits.
But how many chunks of this length can exist? 10^63.
So the probability that the first 25K digits of pi contain a given random 63-digit sequence is 24937 / 10^63.
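For scale, that ratio can be evaluated exactly. This is a sketch assuming the target sequence is drawn uniformly at random and the counted chunks are all distinct; `hit_probability` is a hypothetical helper name.

```python
from fractions import Fraction

def hit_probability(num_chunks, chunk_digits):
    """Chance that one uniformly random chunk equals any of num_chunks
    distinct chunks of the same length (capped at 1)."""
    return min(Fraction(1), Fraction(num_chunks, 10 ** chunk_digits))

p = hit_probability(24937, 63)
print(float(p))  # a vanishingly small number, on the order of 1e-59
```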
from pifs.