twoolie / nbt
Python Parser/Writer for the NBT file format, and its container, the RegionFile.
License: MIT License
How do I install? Please add instructions to the readme.
I tried pip install nbt
but that seems to have installed a different library with a different API.
AttributeError: 'module' object has no attribute 'NBTFile'
So I'm having this problem. I think the cause is on my side, but that seems so illogical, so I'll leave it here hoping someone can help me.
If I replace vill = TAG_List(type = TAG_Compound())
with another tag type, for example vill = TAG_List(type = TAG_Int()),
it works...
(Please note that I'm just a Python beginner.)
the code:
import nbt
from nbt.nbt import *
villages = NBTFile()
villages.name = "Data"
data = TAG_Compound()
data.tags.extend([
TAG_Int(name="Tick", value=14694297)
])
vill = TAG_List(type = TAG_Compound())
vill.name = "Villages"
villages.tags.append(vill)
villages.tags.append(data)
print(villages.pretty_tree())
The Error message:
Traceback (most recent call last):
File "C:\Users\Tom\Desktop\NBT-master\my files\VillageStacker_1.py", line 12, in <module>
vill = TAG_List(type = TAG_Compound())
File "C:\Python34\lib\site-packages\nbt-1.4.1-py3.4.egg\nbt\nbt.py", line 306, in __init__
raise ValueError("No type specified for list: %s" % (name))
ValueError: No type specified for list: None
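Not a definitive diagnosis, but judging from the nbt.py source quoted later in this thread, TAG_Compound subclasses dict, so an empty TAG_Compound() instance is falsy, and the `if type:` check in TAG_List.__init__ therefore never sets tagID. A minimal reproduction of that mechanism:

```python
# Why TAG_List(type=TAG_Compound()) fails while TAG_List(type=TAG_Int()) works:
# TAG_Compound subclasses dict, so an *empty* instance is falsy and the
# `if type:` guard in TAG_List.__init__ skips setting tagID.
class TAG_Compound(dict):  # simplified stand-in for nbt.nbt.TAG_Compound
    id = 10  # TAG_COMPOUND

print(bool(TAG_Compound()))  # False: an empty dict subclass is falsy
print(bool(TAG_Compound))    # True: the class object itself is truthy

# Likely fix: pass the class rather than an instance, e.g.
# vill = TAG_List(type=TAG_Compound)
```

Passing the class object sidesteps the truthiness trap, and `.id` is available on the class just as on an instance.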
Travis is currently configured to execute the tests with Python releases 2.7, 3.3, 3.4, 3.5 and 3.6. The current stable release is 3.7 and should be added to the list. Releases 3.3 and 3.4 are now marked as archived in the Python documentation and should be considered for removal.
nbt/__init__.py
contains a version string, but there are no releases associated with it.
This can easily be done on GitHub by just tagging a certain commit. E.g.
git tag release-1.2 ac2b23d3e4ef9e
git push --tags
If you like, I can look up the appropriate tags and commit hashes.
read the full spec at http://www.minecraft.net/docs/NBT.txt
A copy can be found here. The question is whether it's still valid 8 years later:
https://web.archive.org/web/20100310144708/www.minecraft.net/docs/NBT.txt
If you add enough data to a chunk so that it grows in size (needs another sector), then when you try to write it, the write function (region.py write_chunk) gets stuck in an infinite loop looking for a new place to put the chunk.
It seems like it just keeps reading the first sector of the file over and over.
My hack of a workaround was to just put the chunk at the end of the file since that is usually the first place it would fit anyway.
It seems it is pretty rare for a chunk to need more room. In testing my program that adds data (Populate Chests.py), I probably wrote at least a few thousand chunks by now and a grand total of two needed an extra sector.
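For reference, the sector-allocation scan that write_chunk needs might look like the following standalone sketch (not the library's code; sector numbering and the header layout are assumptions based on the region format):

```python
def find_free_run(free, needed):
    """Return the index of the first run of `needed` consecutive free sectors,
    or None if no run is long enough (the caller should then append the chunk
    at the end of the file, as the workaround above does).
    `free` is one boolean per sector; sectors 0-1 hold the region header
    and must be marked as not free."""
    run = 0
    for i, is_free in enumerate(free):
        run = run + 1 if is_free else 0
        if run == needed:
            return i - needed + 1
    return None

# header sectors occupied, then a 2-sector gap, then a 3-sector gap
free = [False, False, True, True, False, True, True, True]
print(find_free_run(free, 3))  # 5
print(find_free_run(free, 4))  # None -> append at end of file
```

The key property is that the scan always terminates: it either finds a run or falls off the end of the list, which avoids the infinite loop described above.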
What is the code standard for NBT on indentation? Since most files used tabs, that's what I used too. Most users seem to prefer spaces (e.g. 4b7c45e), and I'm fine with that too.
I recently noted that tests.py and some lines in the example files use spaces, while the rest of the files use tabs. Whatever the choice, I'd like to make it consistent. What do you prefer?
Hi,
I'm a noob with GitHub, so I don't know where to post this.
I'm using Python 3, and I made something like the patch below. It seems to work:
from struct import pack, unpack, calcsize, error as StructError
from gzip import GzipFile
import zlib
# Replace UserDict by dict:
# from UserDict import DictMixin
import os, io
TAG_END = 0
TAG_BYTE = 1
TAG_SHORT = 2
TAG_INT = 3
TAG_LONG = 4
TAG_FLOAT = 5
TAG_DOUBLE = 6
TAG_BYTE_ARRAY = 7
TAG_STRING = 8
TAG_LIST = 9
TAG_COMPOUND = 10
class MalformedFileError(Exception):
"""Exception raised on parse error."""
pass
class TAG(object):
"""Each Tag needs to take a file-like object for reading and writing.
The file object will be initialised by the calling code."""
id = None
def __init__(self, value=None, name=None):
self.name = name
self.value = value
#Parsers and Generators
def _parse_buffer(self, buffer):
raise NotImplementedError(self.__class__.__name__)
def _render_buffer(self, buffer):
raise NotImplementedError(self.__class__.__name__)
#Printing and Formatting of tree
def tag_info(self):
return self.__class__.__name__ + \
('("%s")'%self.name if self.name else "") + \
": " + self.__repr__()
def pretty_tree(self, indent=0):
return ("\t"*indent) + self.tag_info()
class _TAG_Numeric(TAG):
def __init__(self, value=None, name=None, buffer=None):
super(_TAG_Numeric, self).__init__(value, name)
self.size = calcsize(self.fmt)
if buffer:
self._parse_buffer(buffer)
#Parsers and Generators
def _parse_buffer(self, buffer):
self.value = unpack(self.fmt, buffer.read(self.size))[0]
def _render_buffer(self, buffer):
buffer.write(pack(self.fmt, self.value))
#Printing and Formatting of tree
def __repr__(self):
return str(self.value)
#== Value Tags ==#
class TAG_Byte(_TAG_Numeric):
id = TAG_BYTE
fmt = ">b"
class TAG_Short(_TAG_Numeric):
id = TAG_SHORT
fmt = ">h"
class TAG_Int(_TAG_Numeric):
id = TAG_INT
fmt = ">i"
class TAG_Long(_TAG_Numeric):
id = TAG_LONG
fmt = ">q"
class TAG_Float(_TAG_Numeric):
id = TAG_FLOAT
fmt = ">f"
class TAG_Double(_TAG_Numeric):
id = TAG_DOUBLE
fmt = ">d"
class TAG_Byte_Array(TAG):
id = TAG_BYTE_ARRAY
def __init__(self, name=None, buffer=None):
super(TAG_Byte_Array, self).__init__(name=name)
if buffer:
self._parse_buffer(buffer)
#Parsers and Generators
def _parse_buffer(self, buffer):
length = TAG_Int(buffer=buffer)
self.value = buffer.read(length.value)
def _render_buffer(self, buffer):
length = TAG_Int(len(self.value))
length._render_buffer(buffer)
buffer.write(self.value)
#Printing and Formatting of tree
def __repr__(self):
return "[%i bytes]" % len(self.value)
class TAG_String(TAG):
id = TAG_STRING
def __init__(self, value=None, name=None, buffer=None):
super(TAG_String, self).__init__(value, name)
if buffer:
self._parse_buffer(buffer)
#Parsers and Generators
def _parse_buffer(self, buffer):
length = TAG_Short(buffer=buffer)
read = buffer.read(length.value)
if len(read) != length.value:
raise StructError()
self.value = str(read, "utf-8")
def _render_buffer(self, buffer):
save_val = self.value.encode("utf-8")
length = TAG_Short(len(save_val))
length._render_buffer(buffer)
buffer.write(save_val)
#Printing and Formatting of tree
def __repr__(self):
return self.value
#== Collection Tags ==#
class TAG_List(TAG):
id = TAG_LIST
def __init__(self, type=None, value=None, name=None, buffer=None):
super(TAG_List, self).__init__(value, name)
if type:
self.tagID = type.id
else: self.tagID = None
self.tags = []
if buffer:
self._parse_buffer(buffer)
if not self.tagID:
raise ValueError("No type specified for list")
#Parsers and Generators
def _parse_buffer(self, buffer):
self.tagID = TAG_Byte(buffer=buffer).value
self.tags = []
length = TAG_Int(buffer=buffer)
for x in range(length.value):
self.tags.append(TAGLIST[self.tagID](buffer=buffer))
def _render_buffer(self, buffer):
TAG_Byte(self.tagID)._render_buffer(buffer)
length = TAG_Int(len(self.tags))
length._render_buffer(buffer)
for i, tag in enumerate(self.tags):
if tag.id != self.tagID:
raise ValueError("List element %d(%s) has type %d != container type %d" %
(i, tag, tag.id, self.tagID))
tag._render_buffer(buffer)
#Printing and Formatting of tree
def __repr__(self):
return "%i entries of type %s" % (len(self.tags), TAGLIST[self.tagID].__name__)
def pretty_tree(self, indent=0):
output = [super(TAG_List,self).pretty_tree(indent)]
if len(self.tags):
output.append(("\t"*indent) + "{")
output.extend([tag.pretty_tree(indent+1) for tag in self.tags])
output.append(("\t"*indent) + "}")
return '\n'.join(output)
class TAG_Compound(TAG, dict):
id = TAG_COMPOUND
def __init__(self, buffer=None):
super(TAG_Compound, self).__init__()
self.tags = []
self.name = ""
if buffer:
self._parse_buffer(buffer)
#Parsers and Generators
def _parse_buffer(self, buffer):
while True:
type = TAG_Byte(buffer=buffer)
if type.value == TAG_END:
#print "found tag_end"
break
else:
name = TAG_String(buffer=buffer).value
try:
#DEBUG print type, name
tag = TAGLIST[type.value](buffer=buffer)
tag.name = name
self.tags.append(tag)
except KeyError:
raise ValueError("Unrecognised tag type")
def _render_buffer(self, buffer):
for tag in self.tags:
TAG_Byte(tag.id)._render_buffer(buffer)
TAG_String(tag.name)._render_buffer(buffer)
tag._render_buffer(buffer)
        buffer.write(b'\x00')  # write TAG_END (must be a bytes literal on Python 3)
# Dict compatibility.
# DictMixin requires at least __getitem__, and for more functionality,
# __setitem__, __delitem__, and keys.
def __getitem__(self, key):
if isinstance(key,int):
return self.tags[key]
elif isinstance(key, str):
for tag in self.tags:
if tag.name == key:
return tag
else:
raise KeyError("A tag with this name does not exist")
else:
raise ValueError("key needs to be either name of tag, or index of tag")
def __setitem__(self, key, value):
if isinstance(key, int):
# Just try it. The proper error will be raised if it doesn't work.
self.tags[key] = value
elif isinstance(key, str):
value.name = key
for i, tag in enumerate(self.tags):
if tag.name == key:
self.tags[i] = value
return
self.tags.append(value)
    def __delitem__(self, key):
        if isinstance(key, int):
            del self.tags[key]
        elif isinstance(key, str):
            for i, tag in enumerate(self.tags):
                if tag.name == key:
                    del self.tags[i]
                    return
            raise KeyError("A tag with this name does not exist")
        else:
            raise ValueError("key needs to be either name of tag, or index of tag")
def keys(self):
return [tag.name for tag in self.tags]
#Printing and Formatting of tree
def __repr__(self):
return '%i Entries' % len(self.tags)
def pretty_tree(self, indent=0):
output = [super(TAG_Compound,self).pretty_tree(indent)]
if len(self.tags):
output.append(("\t"*indent) + "{")
output.extend([tag.pretty_tree(indent+1) for tag in self.tags])
output.append(("\t"*indent) + "}")
return '\n'.join(output)
TAGLIST = {TAG_BYTE:TAG_Byte, TAG_SHORT:TAG_Short, TAG_INT:TAG_Int, TAG_LONG:TAG_Long, TAG_FLOAT:TAG_Float, TAG_DOUBLE:TAG_Double, TAG_BYTE_ARRAY:TAG_Byte_Array, TAG_STRING:TAG_String, TAG_LIST:TAG_List, TAG_COMPOUND:TAG_Compound}
class NBTFile(TAG_Compound):
"""Represents an NBT file object"""
def __init__(self, filename=None, buffer=None, fileobj=None):
super(NBTFile,self).__init__()
self.__class__.__name__ = "TAG_Compound"
self.filename = filename
self.type = TAG_Byte(self.id)
#make a file object
if filename:
self.file = GzipFile(filename, 'rb')
elif buffer:
self.file = buffer
elif fileobj:
self.file = GzipFile(fileobj=fileobj)
else:
self.file = None
#parse the file given intitially
if self.file:
self.parse_file()
if self.filename and 'close' in dir(self.file):
self.file.close()
self.file = None
def parse_file(self, filename=None, buffer=None, fileobj=None):
if filename:
self.file = GzipFile(filename, 'rb')
elif buffer:
self.file = buffer
elif fileobj:
self.file = GzipFile(fileobj=fileobj)
if self.file:
try:
type = TAG_Byte(buffer=self.file)
if type.value == self.id:
name = TAG_String(buffer=self.file).value
self._parse_buffer(self.file)
self.name = name
self.file.close()
else:
raise MalformedFileError("First record is not a Compound Tag")
except StructError as e:
raise MalformedFileError("Partial File Parse: file possibly truncated.")
        else:
            raise ValueError("Need a file to parse")
def write_file(self, filename=None, buffer=None, fileobj=None):
if buffer:
self.filename = None
self.file = buffer
elif filename:
self.filename = filename
self.file = GzipFile(filename, "wb")
elif fileobj:
self.filename = None
self.file = GzipFile(fileobj=fileobj, mode="wb")
elif self.filename:
self.file = GzipFile(self.filename, "wb")
elif not self.file:
raise ValueError("Need to specify either a filename or a file")
#Render tree to file
TAG_Byte(self.id)._render_buffer(self.file)
TAG_String(self.name)._render_buffer(self.file)
self._render_buffer(self.file)
#make sure the file is complete
if 'flush' in dir(self.file):
self.file.flush()
if self.filename and 'close' in dir(self.file):
self.file.close()
(Sorry for my bad English.)
Most files in the doc folder are rather empty. I haven't found a big need for them, but perhaps we should either remove the empty files or fill them :)
Proposal: use NumPy if available, otherwise the native Python version.
Two uses: to speed up 4-bit array <-> byte array conversions and XZY <-> YZX array conversion.
This does require a test function before it can be written, IMHO.
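A sketch of what the proposed fallback pattern could look like for the 4-bit conversion: use NumPy when available, plain Python otherwise. Low-nibble-first ordering is assumed here (it matches Minecraft's packing); the function name is made up for the sketch.

```python
# Sketch: NumPy fast path with a native Python fallback for 4-bit unpacking.
try:
    import numpy as np
except ImportError:
    np = None

def nibbles_to_bytes(nibble_data):
    """Unpack 4-bit values (two per byte, low nibble first) into a bytearray,
    one value per byte."""
    if np is not None:
        packed = np.frombuffer(bytes(nibble_data), dtype=np.uint8)
        out = np.empty(packed.size * 2, dtype=np.uint8)
        out[0::2] = packed & 0x0F   # low nibbles
        out[1::2] = packed >> 4     # high nibbles
        return bytearray(out.tobytes())
    out = bytearray(len(nibble_data) * 2)  # native fallback, same result
    for i, b in enumerate(nibble_data):
        out[2 * i] = b & 0x0F
        out[2 * i + 1] = b >> 4
    return out

print(list(nibbles_to_bytes(bytes([0x21, 0x43]))))  # [1, 2, 3, 4]
```

Both code paths return the same bytearray, so a test function only needs to compare the two implementations against each other on random input.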
type 11 is an array of TAG_Int's, analogous to the byte array
http://minecraft.gamepedia.com/NBT_format
I had a look at the code but I'm afraid I literally can't follow it :D
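A sketch of what the new tag could look like, modeled on the TAG_Byte_Array in the patch pasted earlier in this thread: a 4-byte big-endian length prefix followed by that many signed 32-bit ints. The class shape mirrors that patch but is standalone here, not the library's code.

```python
import io
from struct import Struct

TAG_INT_ARRAY = 11  # tag type 11, per the wiki page above

class TAG_Int_Array:
    """Sketch of TAG_Int_Array, analogous to TAG_Byte_Array:
    a TAG_Int length followed by `length` big-endian signed 32-bit ints."""
    id = TAG_INT_ARRAY
    length_fmt = Struct(">i")

    def __init__(self, name=None):
        self.name = name
        self.value = []

    def _parse_buffer(self, buffer):
        (length,) = self.length_fmt.unpack(buffer.read(4))
        self.value = list(Struct(">%di" % length).unpack(buffer.read(4 * length)))

    def _render_buffer(self, buffer):
        buffer.write(self.length_fmt.pack(len(self.value)))
        buffer.write(Struct(">%di" % len(self.value)).pack(*self.value))

# round-trip check
tag = TAG_Int_Array(name="demo")
tag.value = [1, -2, 70000]
buf = io.BytesIO()
tag._render_buffer(buf)
buf.seek(0)
copy = TAG_Int_Array()
copy._parse_buffer(buf)
print(copy.value)  # [1, -2, 70000]
```

The only real differences from TAG_Byte_Array are the element width (4 bytes instead of 1) and that the value is a list of ints rather than raw bytes.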
When you try the example anvil_blockdata.py on a *.mca file with an existing data array, the following error is raised in the print_chunklayer function:
ValueError: byte must be in range(0, 256)
This is because, when a data array exists, the blocks array should hold short integers and no longer be a bytearray.
I fixed it like this (do not forget to import array):
def print_chunklayer(blocks, data, add, yoffset):
blocks = array.array('h',blocks[yoffset*256:(yoffset+1)*256])
data = array_4bit_to_byte(data[yoffset*128:(yoffset+1)*128])
if add is not None:
add = array_4bit_to_byte(add[yoffset*128:(yoffset+1)*128])
for i,v in enumerate(add):
blocks[i] += 256*v
assert len(blocks) == 256
assert len(data) == 256
for row in grouper(zip(blocks,data), 16):
print (" ".join(("%4d:%-2d" % block) for block in row))
Is anyone working already on this? If not I'll start it soon.
Not all NBT files are gzipped. Minecraft uses two uncompressed NBT files: idcounts.dat and servers.dat. More info at Nbt#Uses.
Is there currently a way to parse an NBT file without uncompressing it? I get this error trying to open servers.dat with nbt.nbt.NBTFile:
>>> os.getcwd()
'/Users/winston/Library/Application Support/minecraft'
>>> os.listdir()
['.DS_Store', 'assets', 'launcher.jar', 'launcher.pack.lzma', 'launcher_profiles.json', 'libraries', 'logs', 'options.txt', 'output-client.log', 'resourcepacks', 'saves', 'screenshots', 'servers.dat', 'stats', 'textures_0.png', 'versions']
>>> nbt.VERSION
(1, 4, 1)
>>> serversnbt = nbt.nbt.NBTFile('servers.dat', 'rb')
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/usr/local/lib/python3.4/site-packages/nbt/nbt.py", line 508, in __init__
self.parse_file()
File "/usr/local/lib/python3.4/site-packages/nbt/nbt.py", line 532, in parse_file
type = TAG_Byte(buffer=self.file)
File "/usr/local/lib/python3.4/site-packages/nbt/nbt.py", line 85, in __init__
self._parse_buffer(buffer)
File "/usr/local/lib/python3.4/site-packages/nbt/nbt.py", line 90, in _parse_buffer
self.value = self.fmt.unpack(buffer.read(self.fmt.size))[0]
File "/usr/local/Cellar/python3/3.4.2_1/Frameworks/Python.framework/Versions/3.4/lib/python3.4/gzip.py", line 365, in read
if not self._read(readsize):
File "/usr/local/Cellar/python3/3.4.2_1/Frameworks/Python.framework/Versions/3.4/lib/python3.4/gzip.py", line 433, in _read
if not self._read_gzip_header():
File "/usr/local/Cellar/python3/3.4.2_1/Frameworks/Python.framework/Versions/3.4/lib/python3.4/gzip.py", line 297, in _read_gzip_header
raise OSError('Not a gzipped file')
OSError: Not a gzipped file
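A quick way to tell the two cases apart is the gzip magic number. The commented-out workaround below is an assumption based on the NBTFile constructor quoted earlier in this thread, where the `buffer` keyword bypasses the GzipFile wrapper entirely:

```python
def is_gzipped(data):
    """gzip streams always begin with the magic bytes 1F 8B."""
    return data[:2] == b"\x1f\x8b"

print(is_gzipped(b"\x1f\x8b\x08\x00"))  # True  (level.dat and friends)
print(is_gzipped(b"\x0a\x00\x00"))      # False (servers.dat, idcounts.dat)

# Possible workaround for uncompressed files (untested assumption, based on
# the constructor above: `buffer` is used as-is, without gzip):
# serversnbt = nbt.nbt.NBTFile(buffer=open('servers.dat', 'rb'))
```

Sniffing the first two bytes would also let NBTFile choose gzip or plain parsing automatically, which seems like the cleanest fix for this issue.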
As stated on line 178, "map still only supports McRegion maps." Since this is part of the test suite, can this be updated to support Anvil?
A small annoyance (to me): the parameters of the __init__ methods in the nbt.TAG classes are very inconsistent. May I fix this?
Are there preferences for proposal 1 or proposal 2?
Current __init__ methods:
class TAG(object):
def __init__(self, value=None, name=None):
class _TAG_Numeric(TAG):
def __init__(self, value=None, name=None, buffer=None):
class TAG_Byte_Array(TAG, MutableSequence):
def __init__(self, name=None, buffer=None):
class TAG_Int_Array(TAG, MutableSequence):
def __init__(self, name=None, buffer=None):
class TAG_String(TAG, Sequence):
def __init__(self, value=None, name=None, buffer=None):
class TAG_List(TAG, MutableSequence):
def __init__(self, type=None, value=None, name=None, buffer=None):
class TAG_Compound(TAG, MutableMapping):
def __init__(self, buffer=None):
Proposal 1 (value, name, buffer order everywhere):
class TAG(object):
def __init__(self, value=None, name=None, buffer=None):
class _TAG_Numeric(TAG):
def __init__(self, value=None, name=None, buffer=None):
class TAG_Byte_Array(TAG, MutableSequence):
def __init__(self, value=None, name=None, buffer=None):
class TAG_Int_Array(TAG, MutableSequence):
def __init__(self, value=None, name=None, buffer=None):
class TAG_String(TAG, Sequence):
def __init__(self, value=None, name=None, buffer=None):
class TAG_List(TAG, MutableSequence):
def __init__(self, type=None, value=None, name=None, buffer=None):
class TAG_Compound(TAG, MutableMapping):
def __init__(self, value=None, name=None, buffer=None):
Proposal 2 (name, value, buffer order everywhere):
class TAG(object):
def __init__(self, name=None, value=None, buffer=None):
class _TAG_Numeric(TAG):
def __init__(self, name=None, value=None, buffer=None):
class TAG_Byte_Array(TAG, MutableSequence):
def __init__(self, name=None, value=None, buffer=None):
class TAG_Int_Array(TAG, MutableSequence):
def __init__(self, name=None, value=None, buffer=None):
class TAG_String(TAG, Sequence):
def __init__(self, name=None, value=None, buffer=None):
class TAG_List(TAG, MutableSequence):
def __init__(self, name=None, type=None, value=None, buffer=None):
class TAG_Compound(TAG, MutableMapping):
def __init__(self, name=None, value=None, buffer=None):
I'm trying to write a program that iterates over all the blocks in a world and compiles some statistics. But I can't figure out how to get all the block data from a chunk. The block_analysis example appears to only work with the old map format.
Could someone give me an example or project that does something like this?
Hi twoolie, thanks for setting up the travis test service; I'm new to it, and it looks promising.
I have two questions on the test infrastructure.
/usr/share/doc
at the discretion of the package manager). I'm asking the second question because the setup now explicitly treats tests.py as a module, by including `import tests`. However, once it is moved to a subfolder, `from tests import tests` no longer works unless we turn it into a package (e.g. by creating tests/__init__.py). I'm hesitant: I'd rather treat it as a collection of test scripts, not as part of the actual module.
This brings up a third question. My current pull request #29 breaks Travis; I'm currently writing a patch for that, hence the questions above.
Sorry for all these questions; I'm trying to find the way that is easiest for you to pull. Having multiple pull requests piling up makes it harder for me to get it right for you (and to be honest, I'm not very familiar with best practices in testing and pulling, so tips are welcome!)
The current nbt module stores the value of a TAG_Byte_Array as a string. This has a few problems: it is not mutable, it is slow, it requires clumsy code using struct.unpack and struct.pack, and it may cause problems when converting to Python 3 (since all strings are Unicode in Python 3).
I propose to store it as a bytearray, which is a native Python object for, well, a byte array. It differs from a bytes object in that it is mutable.
The code change in nbt.py itself is rather simple, see commit 0a7000f.
However, it also requires changes in code that assumes it is a string, such as chunk.py. See c0aa823. Fixing chunk.py is easy enough, but this may also break other code that relies on this internal structure of TAG_Byte_Array.
What do you think? I'm very much in favour of changing it, for speed, Python 3 compatibility and ease of use.
Let me demonstrate the ease of use with an example. This is the current code required to modify an item in the HeightMap:
heightBytes = mynbt['Level']['HeightMap'].value
heightdata = list(struct.unpack("256B", heightBytes))
heightdata[3] = 63
heightBytes = array.array('B', heightdata).tostring()
mynbt['Level']['HeightMap'].value = heightBytes
If the value parameter is mutable, it greatly simplifies the above code. There is no need to pack or unpack anymore:
mynbt['Level']['HeightMap'].value[3] = 63
In fact, it is even trivial to add a __getitem__
and __setitem__
function to TAG_Byte_Array (see e8cd308), so one can simply write:
mynbt['Level']['HeightMap'][3] = 63
In my opinion, this is MUCH cleaner than the current code, and where I like NBT to head to.
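The delegation that commit e8cd308 adds is small enough to sketch in full. The class name here is hypothetical; the point is that once value is a bytearray, sequence access is a three-method wrapper:

```python
class ByteArrayTag:
    """Sketch (hypothetical name): value stored as a bytearray,
    with item access delegated to it, in the spirit of e8cd308."""
    def __init__(self, value=b""):
        self.value = bytearray(value)

    def __len__(self):
        return len(self.value)

    def __getitem__(self, index):
        return self.value[index]

    def __setitem__(self, index, val):
        self.value[index] = val

heightmap = ByteArrayTag(bytes(256))
heightmap[3] = 63      # no struct.pack/unpack dance
print(heightmap[3])    # 63
```

bytearray also enforces the 0-255 range on assignment for free, which the old string-based code could not do.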
My question to all users of NBT is: is the above advantage worth the downside that it will likely break code that still expects TAG_Byte_Array().value to be a string? struct.unpack('256b', my_tag_byte_array.value) will raise a struct.error, although array.array('B', my_tag_byte_array.value) will continue to work just fine (though setting my_tag_byte_array.value to a string using array.array().tostring() may break things in nbt.py).
As an added bonus, it is a bit faster too (the conversion itself is actually 600 times faster, but that's negated by the slow method used to parse the NBT). Here are some timing measurements:
current code: unpack (returns a tuple) and convert to list
mynbt = nbt.NBTFile(filename) 2500 µs excluding actual I/O
blocksBytes = mynbt['Level']['Blocks'].value 6.7 µs
blocktuple = struct.unpack("32768B", blocksBytes) 350 µs
blockdata = list(blocktuple) 180 µs
list: a slow alternative
mynbt = nbt.NBTFile(filename) 2500 µs excluding actual I/O
blocksBytes = mynbt['Level']['Blocks'].value 6.7 µs
blockdata = [i for i in blocksBytes] 2300 µs
bytes: an immutable byte sequence. blazingly fast.
mynbt = nbt.NBTFile(filename) 2500 µs excluding actual I/O
blocksBytes = mynbt['Level']['Blocks'].value 6.7 µs
blockdata = bytes(blocksBytes) 0.2 µs
bytearray: my proposal
mynbt = nbt.NBTFile(filename) 2500 µs excluding actual I/O
blocksBytes = mynbt['Level']['Blocks'].value 6.7 µs
blockdata = bytearray(blocksBytes) 4.2 µs
array: an equally fast alternative.
mynbt = nbt.NBTFile(filename) 2500 µs excluding actual I/O
blocksBytes = mynbt['Level']['Blocks'].value 6.7 µs
blockdata = array.array('B', blocksBytes) 4.4 µs
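To make the proposal concrete, here is the HeightMap edit from above done both ways on stand-in data, so no NBT file is needed. (Note: array.tostring() from the old code is spelled tobytes() on modern Python.)

```python
import struct
import array

height_bytes = bytes(range(256))  # stand-in for mynbt['Level']['HeightMap'].value

# current, string/bytes-valued workflow: unpack, modify, repack
heightdata = list(struct.unpack("256B", height_bytes))
heightdata[3] = 63
repacked = array.array('B', heightdata).tobytes()  # .tostring() on old Pythons

# proposed bytearray workflow: mutate in place, nothing to repack
buf = bytearray(height_bytes)
buf[3] = 63

print(bytes(buf) == repacked)  # True: same result, three lines fewer
```

Both produce identical bytes; the bytearray version just skips the two conversions entirely.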
nbt.chunk is capable of setting blocks, but does not update the light levels in the NBT file as well.
Edit: I consider this a small issue, and perhaps it should not be implemented by the NBT library, as it is very Minecraft specific, and applications that use NBT can implement this as well.
Hi everyone,
I want to create a new servers.dat file, or insert a server name and server IP into an existing servers.dat file.
I can read my servers.dat, but I can't write it.
Please help.
Most Minecraft tools need some database for block IDs/data IDs, or even Biome IDs, entity IDs or item IDs.
Would this or would this not be part of NBT?
On one hand, I'm not in favour of maintaining that list, and I'd like NBT to be somewhat more generic than Minecraft-only (though I currently see little use of NBT outside the Minecraft community).
On the other hand, there is demand for such a list, and in such cases a central repository is useful. With the easy collaboration on GitHub, I also hope to see some external input (so we, the NBT maintainers, don't need to maintain it ourselves).
I made an earlier attempt at such a names module in my experimental branch. I'm not satisfied yet. While it is extendable by external tools, it lacks any non-English support. Maintaining a non-English data list is out of scope in my opinion, but whatever is created should have hooks for others to easily plug in non-English names.
However, before we even consider that, the first question is: is a block ID database in scope of NBT or not?
Erm, I did some file renames, but never changed the Manifest.
I'm a little cautious to touch it myself because I'm uncertain about the correct syntax.
E.g. the current file contains
include *.txt
include examples
but the description at http://wiki.python.org/moin/Distutils/Tutorial seems to indicate it should be
include *.txt
recursive-include examples
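For reference, the distutils template syntax gives recursive-include a directory plus one or more filename patterns, so a working stanza would probably look like this (patterns chosen here as an example, not taken from the repository):

```
include *.txt
recursive-include examples *.py
```

Without the trailing pattern, recursive-include is a syntax error when sdist processes the manifest.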
For convenience, it should be possible to create an NBTFile using nbt.nbt.NBTFile(path), where path is a pathlib.Path object.
There is a missing parenthesis at the end of the file, and the function parse_header does not exist.
The classmethod getchunk is a stub, and would need a class as first argument.
Upon removing the parse_header call, the code continues, but NBT later reports that the specified file is not a gzipped file.
I think it is inevitable that we need to add a test world to NBT. I used an Anvil and a Region chunk in a local branch, but that does not suffice for testing nbt.region and nbt.world.
Such a test file is huge (about 6 MByte) compared to the code. It may be possible to trim it, but that requires an editor, e.g. NBT. And I'm not yet confident enough that NBT makes no mistakes during that editing (that's why we need the test file!)
The question is: is 6 MB extra download a problem, or not?
@twoolie, others: please share your opinion.
If an API change will be made for 2.0, I propose that NBTFile no longer be a subclass of TAG_Compound.
TAG_Compound already does two things: it is a named OrderedDict-like object on one hand, and has a specific serialisation on the other. NBTFile adds a third and fourth thing to it: an optional compression, and file read/write.
I suggest that the object and serialisation belong in nbt, but the compression and file read/write do not. You will notice that region creates NBTFile objects but completely bypasses the latter two functions: obviously the read/write, but also the compression, since it uses zlib instead of GZip.
Finally, the many possible invocations cry out for factory methods (@staticmethod). Or perhaps it's just me, who can't remember the difference between the filename, buffer and fileobj parameters.
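A sketch of what such factory constructors could look like. All names are hypothetical; the three constructors make explicit the gzip, zlib (region) and uncompressed cases that the current keyword arguments conflate:

```python
import gzip
import io
import zlib

class NBTFile2:
    """Sketch of 2.0-style factory constructors (hypothetical names)
    replacing the filename/buffer/fileobj keyword juggling."""
    def __init__(self, data=b""):
        self.data = data  # raw, uncompressed NBT payload

    @classmethod
    def from_gzipped_file(cls, filename):
        # the level.dat case
        with gzip.open(filename, "rb") as f:
            return cls(f.read())

    @classmethod
    def from_zlib_bytes(cls, blob):
        # the region-file case: chunks are zlib-compressed, not gzip
        return cls(zlib.decompress(blob))

    @classmethod
    def from_buffer(cls, fileobj):
        # already-uncompressed stream (e.g. servers.dat)
        return cls(fileobj.read())

print(NBTFile2.from_buffer(io.BytesIO(b"\x0a\x00\x00\x00")).data)
```

Each constructor name states its compression, so callers no longer have to remember which keyword implies gzip and which does not.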
nbt.chunk.generate_heightmap() currently generates a height map that records the highest non-air block. An examination of some Minecraft files shows that nbt['Level']['HeightMap'] contains the highest non-solid block; at least, it seems to ignore grass and flowers.
The current Travis automated tests fail for Python 2.6 and pypy, as well as Python 2.7 if you are using 2.7.8 or earlier.
The reason seems rather mundane: the tests download the file https://github.com/downloads/twoolie/NBT/Sample_World.tar.gz. This file is not included in the standard distribution due to its size.
HTTPS support in Python 2 is very bad. In this particular case, it tries to download this file using the SSLv3 protocol, which was righteously disabled by github.com after the recent SSL Poodle vulnerability.
Python up to 2.7.8 only seems to try SSLv3, and does not use TLSv1.0. So it fails with an error:
>>> import urllib2
>>> request = urllib2.Request("https://github.com/downloads/twoolie/NBT/Sample_World.tar.gz")
>>> remotefile = urllib2.urlopen(request)
urllib2.URLError: <urlopen error [Errno 1] _ssl.c:493: error:14077410:SSL routines:SSL23_GET_SERVER_HELLO:sslv3 alert handshake failure>
Now, this is a bit surprising, as Python 2.6 - 2.7.8 does support TLSv1.0 in the ssl module. However, the urllib2 module (which NBT uses) uses the httplib module, and the httplib module uses the ssl modules, but httplib does not expose this functionality to urllib2 (or to NBT). The situation is actually worse: up to Python 2.7.8, no certificate was ever checked for validity, SNI is not supported, and there are probably more issues if I would really dive into it. [Edit: up to Python 3.3, no certificate was ever checked for validity, see PEP 476].
The situation was actually so bad that the Python core team decided to do the only reasonable thing: they added new features to Python 2.7, and Python 2.7.9 was specifically released to backport the ssl, httplib, urllib2 and ensurepip modules from Python 3 to Python 2. The release notes talk about "several significant changes unprecedented in a bugfix release".
Now this is a bit of a pickle: while the NBT code itself should still work, it currently cannot be tested on Python 2.6 - 2.7.8. I propose the following:
Any feedback, positive and negative, is highly appreciated.
If the feedback is mostly in agreement, I will create a version 1.4.2 (which should include patches for #76 and #77), with a note that it is the last version that supports Python 2.6 - 2.7.8.
Running tests.py with a fresh commit changes bigtest.nbt. The file size changes from 507 bytes to 526 bytes. I don't know what's causing this, but I suggest changing tests.py to work on a copy instead.
@Fenixin Thanks for your continued bug fixes. I haven't played Minecraft in a while and hence haven't contributed much to this project. I presume the same holds for @twoolie.
I just pushed a few bug-fixes that I had lying around for an extended period of time, but just never pushed because they weren't finished at the time. It's mostly adding more self-tests, and updated documentation. At least Travis is happy again :).
I also added some comments and "TODO"s in nbt/region.py at the time. However, I'm still not very familiar with the region format. In case you want to have a look at the code I marked for further investigation, I just pushed it to a temporary branch: macfreek/NBT@4017af1. While most are probably just things that were unclear to me, some may be (potential) bugs.
I propose to update the Chunk API with the following changes:
Here is my refined proposal from an earlier attempt
class BaseChunk(object):
"""Abstract Chunk class."""
class McRegionChunk(BaseChunk):
"""Representation of a Chunk in McRegion format."""
class AnvilChunk(BaseChunk):
"""Representation of a Chunk in Anvil format."""
def __init__(self, nbt):
# self.nbt is a pointer to the NBT TAG_Compound instance
self.nbt = nbt
        # self.blocks is deprecated: it used to refer to a BlockArray instance.
        # Its functionality (and all its variables) are now present in McRegionChunk.
        # It is present for backward compatibility.
self.blocks = self
        # self.blockdata is a flat array of (block id, data) tuples in native order.
        # In McRegion the length is 32768 and the order is XZY, thus index = ((x*16)+z)*16+y
        # In Anvil the length is n*4096 and the order is YZX, thus index = ((y*16)+z)*16+x,
        # with n = 0...16 (thus length 0...65536)
        # blockdata[i] = (256*addblocks[i] + blocks[i], data[i])
self.blockdata = []
        # self.update_callbacks is a list of functions that are called just before
        # writing the NBT file. By convention, the first function updates the heightmap
        # and the second function updates the light levels.
        # A callback function should take one parameter, a BaseChunk instance.
        # These functions are called right after self.update_block_nbt()
self.update_callbacks = [ update_heightmap, update_lightlevels ]
# The following instance variables have been removed:
# self.coords
# self.blocksList
# self.dataList
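The two index conventions described in the comments above can be written out as plain functions (a standalone sketch, not part of the proposed class):

```python
def mcregion_index(x, y, z):
    """XZY order used by McRegion: index = ((x * 16) + z) * 16 + y."""
    return ((x * 16) + z) * 16 + y

def anvil_index(x, y, z):
    """YZX order used by Anvil sections: index = ((y * 16) + z) * 16 + x,
    where y is the height within the 16-block-tall section."""
    return ((y * 16) + z) * 16 + x

print(mcregion_index(1, 0, 0))  # 256
print(anvil_index(0, 1, 0))     # 256
```

Keeping these two formulas in one place is exactly what self.blockdata buys: every accessor goes through a single index computation per format.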
Methods:
#
# Metadata
#
def get_coords(self):
"""Return the x,z coordinates of the chunk. Multiply by 16 to get the global block coordinates."""
"""Unmodified."""
#
# Heightmap and Light level functions
#
def generate_heightmap(self, buffer=False, as_array=False): # McRegion
def generate_heightmap(self): # Anvil
"""McRegion: Returns a bytearray containing the highest solid block. """
"""Anvil: Returns a list containing the highest solid block. """
"""Changed: buffer and as_array boolean parameters are only present in McRegion and
removed from Anvil.
If buffer is True, the result is converted to a io.BytesIO instance.
as_array is ignored (was: result converted to a array.array instance.)
Reason for removal in Anvil is that the heightmap are now ints instead of bytes,
and these encoding conversions do no belong in chunk.py"""
def set_heightmap_callback(self, callback):
"""Set the callback function that is used to calculate the heightmap from the blockdata.
The callback function should take one parameter, a BaseChunk class.
The callback function is called right after self.update_block_nbt()"""
"""New"""
def set_lightlevel_callback(self, callback):
"""Set the callback function that is used to calculate the light levels from the blockdata.
The callback function should take one parameter, a BaseChunk class.
The callback function is called right after self.update_block_nbt()"""
"""New"""
#
# NBT functions
#
def parse_blocks(self):
"""Read NBT and fill self.blockdata, based on Blocks, Data, and AddBlocks in NBT file"""
"""Changed: now uses self.blockdata instead of self.blocksList and self.dataList"""
def update_block_nbt(self):
"""McRegion: Set self.nbt['Level']['Blocks'] and self.nbt['Level']['Data']
based on self.blockdata """
"""Ǎnvil: Set self.nbt['Level']['Sections'][i]['Blocks'],
self.nbt['Level']['Sections'][i]['AddBlocks'] (if required)
and self.nbt['Level']['Sections'][i]['Data'] based on self.blockdata"""
"""New"""
def update_nbt(self):
"""Update self.nbt based on self.blockdata (including heightmap and light levels).
This calls update_block_nbt() and the callback functions in order"""
"""New"""
def get_nbt(self):
"""Update the nbt (if block data was changed) and return it"""
"""New"""
#
# Block retrieval functions
#
def get_block(self, x, y, z):
"""Return the block id of the block at the x,y,z coordinates relative to this chunk"""
"""Changed: coord parameter removed for speed"""
def get_data(self, x, y, z):
"""Return the data id of the block at the x,y,z coordinates relative to this chunk"""
"""Changed: coord parameter removed for speed"""
def get_block_and_data(self, x, y, z):
"""Return a tuple (block id, data id) of the block at the x,y,z coordinates relative to
this chunk"""
"""Changed: coord parameter removed for speed"""
def get_all_blocks(self):
"""Iterate over all block ids, including all air blocks.
For more efficiency, use get_defined_blocks()"""
"""Unmodified"""
def get_all_blocks_and_data(self):
"""Iterate over (block id, data) tuples, including undefined (air) blocks"""
"""Unmodified"""
def get_defined_blocks(self):
"""Iterate over all defined block ids, excluding air blocks in undefined sections"""
"""New"""
def get_defined_blocks_and_data(self):
"""Iterate over all defined (block id, data) tuples, excluding air blocks in undefined sections"""
"""New"""
#
# Structured block functions and block setting functions
#
def set_block(self, x,y,z, id, data=0):
"""Set the block to specified id and data value."""
"""Unmodified"""
def set_all_blocks_and_data(self, list):
"""McRegion: Replace all blocks with the given (block id, data) tuples.
All blocks should be specified in a flat list of 32768 entries in native XZY order."""
"""Anvil: Replace all blocks with the given (block id, data) tuples.
All blocks should be specified in a flat list of a multiple of 4096 entries in native YZX order.
If the list is smaller than 65536, the remaining blocks are zeroed (set to air)"""
"""WARNING: this function behaves slightly different for McRegion and Anvil"""
"""New"""
def set_blocks(self, dict=None, fill_air=False):
"""Replace blocks with specificied (x,y,z) coordinates. Each item is a (block id,data) tuple
WARNING: the syntax of this function has changed; for lists, use set_all_blocks_and_data().
It also now requires a (block id, data) tuple"""
"""Changed"""
#
# Deprecated and Removed functions
#
def get_blocks_struct(self):
"""Return a dict with defined (x,y,z): (block id, data) tuples. air blocks in undefined
sections may be excluded."""
"""Unmodified. Deprecated. May be removed in a future version."""
def get_blocks_byte_array(self, buffer=False):
"""Return blockList as a byte array"""
"""Removed"""
raise NotImplementedError("Use get_all_blocks() instead of get_blocks_byte_array()")
def get_all_data(self):
"""Return dataList as a list"""
"""Removed"""
raise NotImplementedError("Use zip(*self.get_all_blocks_and_data())[1] instead of get_all_data()")
def get_data_byte_array(self, buffer=False):
"""Return dataList as a byte array"""
"""Removed"""
raise NotImplementedError()
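One caveat about the error message suggested for get_all_data() above: in Python 3, zip() returns an iterator, so zip(*...)[1] raises a TypeError. A version that works on both Python 2 and 3, using a plain list as a stand-in for the chunk data, would be:

```python
# Stand-in for chunk.get_all_blocks_and_data(): (block id, data) tuples.
blocks_and_data = [(1, 0), (17, 2), (35, 14)]

# zip(*blocks_and_data)[1] works in Python 2 but fails in Python 3,
# because zip() now returns an iterator. Unpack explicitly instead:
data_values = [data for _, data in blocks_and_data]
print(data_values)  # [0, 2, 14]
```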
#
# Biome functions
#
def get_biome(self, x, z):
"""Return the biome IDs at the specified x,z coordinated (relative to this chunk).
An ID of 255 means "Undetermined"."""
"""New"""
def get_biomes(self):
"""Return a list of biome IDs. The list if a flat array of integers in ZX order.
An ID of 255 means "Undetermined"."""
"""New"""
# self.biomes is a flat array of integers (0-255) in ZX order. (i = (z * 16 + x))
# Since biome IDs are not stored in the McRegion format, it is always set to 255 ("undetermined")
#
# Section functions. These methods are only available for Anvil.
#
def get_section_blocks_and_data(self, height):
"""Return a list of 4096 (block id, data) tuples in YZX order, for the given section height.
The section height is 1/16 of the block height."""
"""New. Anvil only"""
def iter_defined_sections(self):
"""Iterate over each defined section in order from lowest to highest:
Each iteration yields a TAG_Compound defining the section.
Multiply the section's ['Y'] value by 16 to get the base height of the section.
This is a reasonably fast routine. For even greater speed, don't use a Chunk class and
iterate the NBT yourself."""
"""New. Anvil only"""
def get_max_section_height(self):
"""Return the height of the highest defined section. Multiple by 16 and add 15 to get the upper
boundary for the height of the highest defined block. This method is only available for Anvil."""
"""New. Anvil only"""
#
# Height map routines
#
def get_min_floor_height(self):
"""Return the height of the lowest solid block reachable by sunlight in this chunk."""
"""New. Experimental. Function may be removed in later versions"""
def get_max_floor_height(self):
"""Return the height of the highest solid block in this chunk."""
"""New. Experimental. Function may be removed in later versions"""
def get_min_defined_block_height(self):
"""Return the height of the lowest non-air block in this chunk. Usually 0."""
"""New. Experimental. Function may be removed in later versions"""
def get_max_defined_block_height(self):
"""Return the highest non-air block. This is a slow routine. For a fast alternative,
use get_max_floor_height()
for the lower boundary and 16*get_max_section_height()+15 for the upper boundary."""
"""New. Experimental. Function may be removed in later versions"""
def update_heightmap(chunk):
"""Set nbt['Level']['HeightMap'] based on self.blockdata"""
def update_lightlevels(self):
"""McRegion: Set nbt['Level']['SkyLight'] and nbt['Level']['BlockLight'] based on
self.blockdata"""
"""Anvil: Set self.sections[i]['SkyLight'] and self.sections[i]['BlockLight'] based
on self.blockdata"""
Exactly as the title describes. I already have a test; I'll send a pull request as soon as I have a fix.
As to how I came across this... Beta's trying to grow support for saving NBT data back to disk, and there are a couple warts, such as whether there's a pre-existing file that needs to be saved over. Since NBT doesn't support mmaps or other magical writeback mechanisms, it'd be nice to at least not die on this kind of corner case.
I've been using the example scripts on save files from the latest version, 1.12.2, with mixed results. I'm not sure yet if the problems are in the main library or just the examples themselves. Summary of what I've found so far:
-biome_analysis.py seems to work on these save files, correctly enumerates biome types.
-mob_analysis.py seems to work, too, listing every mob in the world folder.
-regionfile_analysis.py doesn't work, gives errors like:
"chunk 2,1 is not a valid NBT file: outer object is not a TAG_Compound, but '\n'"
for each chunk in the region file. Digging into this indicates that it is successfully finding the chunks in the region file, but not successfully parsing them.
-block_analysis.py reports:
"0 total blocks in region, 0 are non-air (0.0000%)"
for all of the save files I've produced using version 1.12.2, which seems to indicate that it's not finding anything at all.
I'm going to look into the underlying reasons more, but it looks like the major difference between biome_analysis and regionfile_analysis is that the former uses the library heavily, while the latter duplicates a lot of the work in order to get a more detailed view. Hopefully that means the library is still fully compatible with modern saves and we just need to update some of the examples.
Line 152:
def _parse_buffer(self, buffer:
This is missing a closing parenthesis, and should be:
def _parse_buffer(self, buffer):
Twoolie recently accepted my pull request #31, which modifies the output of str()
, repr()
, pretty_tree()
and tag_info()
of NBT Tag objects, but raised some concern:
the whole point of pretty_tree is to do this sort of "meaningful" printing. I'd feel more comfortable if you kept the output of pretty_tree the same. Perhaps move the old logic over to str if you are going to override repr?
The reason to make this change was because I was confused by the following output:
>>> f = nbt.NBTFile("bigtest.nbt")
>>> print(f)
11 Entries
>>> print(repr(f))
11 Entries
It left me wondering "what type of object is f? And what are these 11 entries?"
Given the concern raised above, let me poll here what the preferred output is.
TAG objects have four methods that return a string:
__str__()
__repr__()
tag_info()
pretty_tree()
Here's a quick side-by-side comparison of the changes in output. The full output of pretty_tree() is listed at the bottom of this post.
[Side-by-side comparison tables of repr(), str(), tag_info() and pretty_tree() output before and after commit #31; the table cells were lost.]
Observe that the output of tag_info() and pretty_tree() is only slightly changed; TAG_Int_Array.pretty_tree() now mimics TAG_Byte_Array.pretty_tree(). The most important changes are to str() and repr().
The Python manual has the following requirements for the str() and repr() functions:
repr: should return a string that is acceptable to eval(). If this is not possible, return a string enclosed in angle brackets that contains the name of the type of the object and the address of the object.
str: return an "informal" string representation of an object. The return value must be a string object (a byte string for Python 2 and a Unicode string for Python 3).
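For reference, that convention can be illustrated with a toy class (illustrative only; not the nbt library's actual code):

```python
class Tag:
    """Toy tag demonstrating the __repr__/__str__ split discussed above."""
    def __init__(self, name, value):
        self.name = name
        self.value = value

    def __repr__(self):
        # Not recoverable by eval(), so use the angle-bracket form:
        # type name plus object address.
        return "<%s(%r) at 0x%x>" % (type(self).__name__, self.name, id(self))

    def __str__(self):
        # Informal representation: just the payload.
        return str(self.value)

t = Tag('intTest', 2147483647)
print(str(t))   # 2147483647
print(repr(t))  # e.g. <Tag('intTest') at 0x10e9a5fd0>
```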
1. What should be the output of __repr__()?
a. 11 Entries (previous result; goes against Python guidelines on __repr__)
b. <TAG_Compound('Level') at 0x10e9a5fd0>
(current solution)TAG_Compound(name='Level', value=[ TAG_Long(name='longTest', value=9223372036854775807), TAG_Short(name='shortTest', value=32767), TAG_String(name='stringTest', value=u'HELLO WORLD THIS IS A TEST STRING ÅÄÖ!'), TAG_Float(name='floatTest', value=0.4982314705848694), TAG_Int(name='intTest', value=2147483647), TAG_Compound(name='nested compound test', value=[ TAG_Compound(name='ham', value=[ TAG_String(name='name', value='Hampus'), TAG_Float(name='value', value=0.75) ]) TAG_Compound(name='egg',value=[ TAG_String(name='name', value='Eggbert'), TAG_Float(name='value', value=0.5) ]) ]) TAG_List(name='listTest (long)', value=[ TAG_Long(value=11), TAG_Long(value=12), TAG_Long(value=13), TAG_Long(value=14), TAG_Long(value=15) ]) TAG_List(name='listTest (compound)', value=[ TAG_Compound(value=[ TAG_String(name='name', value='Compound tag #0'), TAG_Long(name='created-on', value=1264099775885) ]) TAG_Compound(value=[ TAG_String(name='name', value='Compound tag #1'), TAG_Long(name='created-on', value=1264099775885) ]) ]) TAG_Byte('byteTest', value=127), TAG_Byte_Array(name='byteArrayTest (the first 1000 values of (n*n*255+n*7)%100, starting with n=0 (0, 62, 34, 16, 8, ...))', value=[0, 62, 34, 16, 8, 10, 22, 44, 76, 18, 70, 32, 4, 86, 78, 80, 92, 14, 46, 88, 40, 2, 74, 56, 48, 50, 62, 84, 16, 58, 10, 72, 44, 26, 18, 20, 32, 54, 86, 28, 80, 42, 14, 96, 88, 90, 2, 24, 56, 98, 50, 12, 84, 66, 58, 60, 72, 94, 26, 68, 20, 82, 54, 36, 28, 30, 42, 64, 96, 38, 90, 52, 24, 6, 98, 0, 12, 34, 66, 8, 60, 22, 94, 76, 68, 70, 82, 4, 36, 78, 30, 92, 64, 46, 38, 40, 52, 74, 6, 48, 0, 62, 34, 16, 8, 10, 22, 44, 76, 18, 70, 32, 4, 86, 78, 80, 92, 14, 46, 88, 40, 2, 74, 56, 48, 50, 62, 84, 16, 58, 10, 72, 44, 26, 18, 20, 32, 54, 86, 28, 80, 42, 14, 96, 88, 90, 2, 24, 56, 98, 50, 12, 84, 66, 58, 60, 72, 94, 26, 68, 20, 82, 54, 36, 28, 30, 42, 64, 96, 38, 90, 52, 24, 6, 98, 0, 12, 34, 66, 8, 60, 22, 94, 76, 68, 70, 82, 4, 36, 78, 30, 92, 64, 46, 38, 40, 52, 74, 6, 48, 0, 62, 34, 
16, 8, 10, 22, 44, 76, 18, 70, 32, 4, 86, 78, 80, 92, 14, 46, 88, 40, 2, 74, 56, 48, 50, 62, 84, 16, 58, 10, 72, 44, 26, 18, 20, 32, 54, 86, 28, 80, 42, 14, 96, 88, 90, 2, 24, 56, 98, 50, 12, 84, 66, 58, 60, 72, 94, 26, 68, 20, 82, 54, 36, 28, 30, 42, 64, 96, 38, 90, 52, 24, 6, 98, 0, 12, 34, 66, 8, 60, 22, 94, 76, 68, 70, 82, 4, 36, 78, 30, 92, 64, 46, 38, 40, 52, 74, 6, 48, 0, 62, 34, 16, 8, 10, 22, 44, 76, 18, 70, 32, 4, 86, 78, 80, 92, 14, 46, 88, 40, 2, 74, 56, 48, 50, 62, 84, 16, 58, 10, 72, 44, 26, 18, 20, 32, 54, 86, 28, 80, 42, 14, 96, 88, 90, 2, 24, 56, 98, 50, 12, 84, 66, 58, 60, 72, 94, 26, 68, 20, 82, 54, 36, 28, 30, 42, 64, 96, 38, 90, 52, 24, 6, 98, 0, 12, 34, 66, 8, 60, 22, 94, 76, 68, 70, 82, 4, 36, 78, 30, 92, 64, 46, 38, 40, 52, 74, 6, 48, 0, 62, 34, 16, 8, 10, 22, 44, 76, 18, 70, 32, 4, 86, 78, 80, 92, 14, 46, 88, 40, 2, 74, 56, 48, 50, 62, 84, 16, 58, 10, 72, 44, 26, 18, 20, 32, 54, 86, 28, 80, 42, 14, 96, 88, 90, 2, 24, 56, 98, 50, 12, 84, 66, 58, 60, 72, 94, 26, 68, 20, 82, 54, 36, 28, 30, 42, 64, 96, 38, 90, 52, 24, 6, 98, 0, 12, 34, 66, 8, 60, 22, 94, 76, 68, 70, 82, 4, 36, 78, 30, 92, 64, 46, 38, 40, 52, 74, 6, 48, 0, 62, 34, 16, 8, 10, 22, 44, 76, 18, 70, 32, 4, 86, 78, 80, 92, 14, 46, 88, 40, 2, 74, 56, 48, 50, 62, 84, 16, 58, 10, 72, 44, 26, 18, 20, 32, 54, 86, 28, 80, 42, 14, 96, 88, 90, 2, 24, 56, 98, 50, 12, 84, 66, 58, 60, 72, 94, 26, 68, 20, 82, 54, 36, 28, 30, 42, 64, 96, 38, 90, 52, 24, 6, 98, 0, 12, 34, 66, 8, 60, 22, 94, 76, 68, 70, 82, 4, 36, 78, 30, 92, 64, 46, 38, 40, 52, 74, 6, 48, 0, 62, 34, 16, 8, 10, 22, 44, 76, 18, 70, 32, 4, 86, 78, 80, 92, 14, 46, 88, 40, 2, 74, 56, 48, 50, 62, 84, 16, 58, 10, 72, 44, 26, 18, 20, 32, 54, 86, 28, 80, 42, 14, 96, 88, 90, 2, 24, 56, 98, 50, 12, 84, 66, 58, 60, 72, 94, 26, 68, 20, 82, 54, 36, 28, 30, 42, 64, 96, 38, 90, 52, 24, 6, 98, 0, 12, 34, 66, 8, 60, 22, 94, 76, 68, 70, 82, 4, 36, 78, 30, 92, 64, 46, 38, 40, 52, 74, 6, 48, 0, 62, 34, 16, 8, 10, 22, 44, 76, 18, 70, 32, 4, 86, 78, 80, 
92, 14, 46, 88, 40, 2, 74, 56, 48, 50, 62, 84, 16, 58, 10, 72, 44, 26, 18, 20, 32, 54, 86, 28, 80, 42, 14, 96, 88, 90, 2, 24, 56, 98, 50, 12, 84, 66, 58, 60, 72, 94, 26, 68, 20, 82, 54, 36, 28, 30, 42, 64, 96, 38, 90, 52, 24, 6, 98, 0, 12, 34, 66, 8, 60, 22, 94, 76, 68, 70, 82, 4, 36, 78, 30, 92, 64, 46, 38, 40, 52, 74, 6, 48, 0, 62, 34, 16, 8, 10, 22, 44, 76, 18, 70, 32, 4, 86, 78, 80, 92, 14, 46, 88, 40, 2, 74, 56, 48, 50, 62, 84, 16, 58, 10, 72, 44, 26, 18, 20, 32, 54, 86, 28, 80, 42, 14, 96, 88, 90, 2, 24, 56, 98, 50, 12, 84, 66, 58, 60, 72, 94, 26, 68, 20, 82, 54, 36, 28, 30, 42, 64, 96, 38, 90, 52, 24, 6, 98, 0, 12, 34, 66, 8, 60, 22, 94, 76, 68, 70, 82, 4, 36, 78, 30, 92, 64, 46, 38, 40, 52, 74, 6, 48, 0, 62, 34, 16, 8, 10, 22, 44, 76, 18, 70, 32, 4, 86, 78, 80, 92, 14, 46, 88, 40, 2, 74, 56, 48, 50, 62, 84, 16, 58, 10, 72, 44, 26, 18, 20, 32, 54, 86, 28, 80, 42, 14, 96, 88, 90, 2, 24, 56, 98, 50, 12, 84, 66, 58, 60, 72, 94, 26, 68, 20, 82, 54, 36, 28, 30, 42, 64, 96, 38, 90, 52, 24, 6, 98, 0, 12, 34, 66, 8, 60, 22, 94, 76, 68, 70, 82, 4, 36, 78, 30, 92, 64, 46, 38, 40, 52, 74, 6, 48]), TAG_Double(name='doubleTest', value=0.4931287132182315) ]) ])
2. What should be the output of str()?
For single-value entities, like TAG_Numeric and TAG_String, I think it should just return str(self.value). For collection TAGs, it can be either one of:
a. 11 Entries (previous result)
b. TAG_Compound("Level"): 11 Entries (same as tag_info)
c. non-recursive nesting of all entries (current solution): {TAG_Long('longTest'): 9223372036854775807, TAG_Short('shortTest'): 32767, TAG_String('stringTest'): HELLO WORLD THIS IS A TEST STRING ÅÄÖ!, TAG_Float('floatTest'): 0.4982314705848694, TAG_Int('intTest'): 2147483647, TAG_Compound('nested compound test'): {2 Entries}, TAG_List('listTest (long)'): [5 TAG_Long(s)], TAG_List('listTest (compound)'): [2 TAG_Compound(s)], TAG_Byte('byteTest'): 127, TAG_Byte_Array('byteArrayTest (the first 1000 values of (n*n*255+n*7)%100, starting with n=0 (0, 62, 34, 16, 8, ...))'): [1000 byte(s)], TAG_Double('doubleTest'): 0.4931287132182315}
d. As c, but with infinite nesting. Kind of like solution 1c above.
3. What should be the output of collection values after tag_info() (and thus pretty_tree())?
I suspect we all agree on TAG_Long('created-on'): 1264099775885 for TAG_Long. However, for collection values in TAG_List, TAG_Compound, TAG_Byte_Array and TAG_Int_Array it is less obvious.
a. Without [] and {}, inconsistent names (previous solution, exactly as in the original NBT.txt specification):
TAG_Byte_Array('byteArrayTest'): [1000 bytes]
TAG_List("listTest (long)"): 5 entries of type TAG_Long
TAG_Compound("Level"): 11 Entries
b. With [] and {}, consistent names (current solution):
TAG_Byte_Array('byteArrayTest'): [1000 byte(s)]
TAG_List('listTest (long)'): [5 TAG_Long(s)]
TAG_Compound('Level'): {11 Entries}
c. Without [] and {}, consistent names:
TAG_Byte_Array('byteArrayTest'): 1000 byte(s)
TAG_List('listTest (long)'): 5 TAG_Long(s)
TAG_Compound('Level'): 11 Entries
d. With [] and {}, inconsistent names:
TAG_Byte_Array('byteArrayTest'): [1000 bytes]
TAG_List("listTest (long)"): [5 entries of type TAG_Long]
TAG_Compound("Level"): {11 Entries}
4. What should be the output of string values in tag_info() (and thus pretty_tree())?
a. TAG_String("stringTest"): HELLO WORLD THIS IS A TEST STRING ÅÄÖ! (current implementation)
b. TAG_String("stringTest"): u'HELLO WORLD THIS IS A TEST STRING ÅÄÖ!' (clearer what it is)
5. What should be the repr() string for an NBTFile object?
NBTFile is a subclass of TAG_Compound, and instances are presented as if they were TAG_Compounds. This may go against Python guidelines for __repr__(). (I personally don't mind the current solution, but a change is fine too, since I probably use str() instead of repr(), and str() will continue to behave as TAG_Compound.)
a. <TAG_Compound('Level') at 0x10e9a5fd0> (current solution)
b. <NBTFile('tests/bigtest.nbt') at 0x10e9a5fd0>
c. NBTFile('tests/bigtest.nbt')
6. How should str() deal with non-ASCII characters in Python 2?
str(nbt.NBTFile("tests/bigtest.nbt")) may yield a UnicodeEncodeError if a TAG_String contains non-ASCII characters, such as in the example. Python 3 handles this gracefully, but Python 2 does not. This mimics existing behaviour: in Python 2, str(u'¿whåt?') also raises a UnicodeEncodeError.
a. return str(self.value) (current solution; mimics Python behaviour, but may raise UnicodeEncodeError)
b. return unicode(self.value) (Python 3 solution, but may not be what users expect from str() in Python 2)
c. return self.value.encode('utf-8') (makes assumptions about encoding, which may be incorrect)
d. return self.value.encode(encoding) with encoding based on sys.stdout.encoding, locale.getpreferredencoding(), sys.getdefaultencoding() or some other magic (mimics the print function)
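For option d, the encoding lookup chain might look like this sketch (the exact fallback order is a design choice, not settled here):

```python
import locale
import sys

def guess_output_encoding():
    """Pick an encoding for rendering tag strings, roughly mimicking
    what the print function does: stdout's encoding if available,
    else the locale's preferred encoding, else Python's default."""
    return (getattr(sys.stdout, 'encoding', None)
            or locale.getpreferredencoding(False)
            or sys.getdefaultencoding())

# A TAG_String value could then be rendered as:
#   self.value.encode(guess_output_encoding())
print(guess_output_encoding())
```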
The PyPI package at http://pypi.python.org/pypi/NBT/ is at version 1.1, and the most recent git tag indicates that version 1.3 is available. Is this intentional? Is 1.3 stable enough for distribution?
A relatively recent addition to NBT is world.py with the WorldFolder class. The expected use is for tools that iterate through all chunks, without caring about the specific region file.
A common complaint I hear is that NBT is slow. One way to speed things up is to process each region file using a different subprocess and combine the results (this would be a Map-Reduce pattern). The best way to implement this is using a callback function.
E.g.:
def count_blocks(chunk):
    """Given a chunk, count the occurrences of each block ID in this chunk"""
    chunk_block_count = [0]*256  # array of 256 integers, one for each block ID
    for block_id in chunk.get_all_blocks():
        chunk_block_count[block_id] += 1
    return chunk_block_count

def summarize_blocks(chunk_block_counts):
    """Given multiple chunk_block_count arrays, add them together."""
    total_block_count = [0]*256  # array of 256 integers, one for each block ID
    for chunk_block_count in chunk_block_counts:
        for block_id in range(256):
            total_block_count[block_id] += chunk_block_count[block_id]
    return total_block_count

world = WorldFolder(myfolder)
block_count = world.chunk_mapreduce(count_blocks, summarize_blocks)
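For illustration, the proposed helper could be sketched as below. Everything here is an assumption about a possible API: the chunks are plain lists of block IDs so the sketch runs standalone, and the sequential map() could be swapped for multiprocessing.Pool.map() to get the per-region parallelism:

```python
def count_blocks(chunk):
    """Map step: count how often each block ID occurs in one chunk."""
    chunk_block_count = [0] * 256
    for block_id in chunk:  # stand-in for chunk.get_all_blocks()
        chunk_block_count[block_id] += 1
    return chunk_block_count

def summarize_blocks(chunk_block_counts):
    """Reduce step: element-wise sum of the per-chunk count arrays."""
    total = [0] * 256
    for counts in chunk_block_counts:
        for block_id, n in enumerate(counts):
            total[block_id] += n
    return total

def chunk_mapreduce(chunks, map_func, reduce_func):
    """Hypothetical WorldFolder.chunk_mapreduce: apply map_func to each
    chunk, then combine the results with reduce_func. With a
    multiprocessing.Pool, the map() below could become pool.map()."""
    return reduce_func(map(map_func, chunks))

toy_chunks = [[1, 1, 2], [2, 3]]  # two tiny "chunks" of block IDs
totals = chunk_mapreduce(toy_chunks, count_blocks, summarize_blocks)
print(totals[1], totals[2], totals[3])  # 2 2 1
```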
However, I fear that the term "mapreduce" is not well known to all programmers, and I'm looking for an easier name. Would the following be easier to understand?
world = WorldFolder(myfolder)
chunk_block_counts = world.process_chunks(count_blocks)
block_count = summarize_blocks(chunk_block_counts)
The advantage is that the parallelisation can happen behind the scenes (though the multiprocessing.Pool class already makes it very easy).
The disadvantage is that it adds a third method to the existing get_chunks and iter_chunks methods in the WorldFolder class. In addition, there would probably also need to be a process_nbt and process_regions next to process_chunks.
In retrospect, the difference between get_chunks (which returns a list) and iter_chunks (which returns an iterator) is so minor (iterators consume less memory, but lists can be cached) that it did not warrant the double function. I'm inclined to remove the cached get_chunks (though I liked the name better than iter_chunks).
Any opinions?
The Sample World at https://github.com/twoolie/NBT/downloads does not contain Biome data.
The world was a regular McRegion world, converted to Anvil with Mojang's converter.
It turns out that this converter does not add biome data. An alternative method is to fire up the Minecraft client to do the conversion, but that will move the mobs before it can be closed, which I consider a disadvantage (if both worlds are equal, it's easier to compare the McRegion and Anvil parsers in test scripts). Opening the file in a Minecraft server may not move the mobs if the /stop command is given immediately, but will always generate a 380x380 area around spawn, further increasing the file size.
If you have any advice on adding biome data without changing the rest of the region/chunk data, please post here.
Currently, most tests are done on "Sample World", which is a McRegion world.
Ideally, we should have a "McRegion Sample World" (this one), an "Anvil Sample World" and a "Flattened Sample World", and run the test scripts (where applicable) on all worlds.
Also, we need to check that all example scripts are tested.
I'm not sure how to convert the Sample World in a way that changes nothing beyond the format conversion (e.g. prevent the client from generating new chunks or moving entities), and if we cannot convert it that way, how to update the test suites.
@twoolie Just letting you know I accidentally uploaded a bunch of old branches. I deleted them again, but this is probably why you saw a lot of Travis error reports. My apologies.
I think either the mca file format or the chunk format changed, any chance we can get nbt working again with 1.13?
Similar to #64, __delitem__() in TAG_Int_Array has an extra parameter. This causes exceptions with pop() and remove().
Hey, it's Omeganx again. I'm having some trouble with the write_file() method; here is my code:
import nbt
from nbt.nbt import *
xmax, ymin, zmin = -77, 114, 252 ##the coordinates of the farm in my current world (for testing and debugging)
xlong, zlong = 40, 32 ##dimension of the iron farm
xmin = xmax-xlong
ts = 14694294 ## the last time a villager was near (just a random value)
villages = NBTFile() ## see villages.dat; this part tries to rebuild it with the right coordinates (restacking the villages; the coordinates can be changed later)
data = TAG_Compound()
data.name = "data"
data.tags.extend([
TAG_Int(name="Tick", value = 100000)
])
Villages = TAG_List(type=TAG_Compound)
for z in range(32):
    compound = TAG_Compound()
    liste = TAG_List(type=TAG_Compound, name="Doors")
    totalx, totaly, totalz = 0, 0, 0
    for x in range(xmin, xmin+11, 1):
        door = TAG_Compound()
        door.tags.extend([
            TAG_Int(name="X", value=x),
            TAG_Int(name="Y", value=ymin),
            TAG_Int(name="Z", value=zmin+z),
            TAG_Int(name="IDX", value=2),
            TAG_Int(name="IDZ", value=0),
            TAG_Int(name="TS", value=ts),
        ])
        totalx += x
        totaly += ymin
        totalz += zmin+z
        liste.tags.append(door)
    for x in range(xmax-11, xmax, 1):
        door = TAG_Compound()
        door.tags.extend([
            TAG_Int(name="X", value=x),
            TAG_Int(name="Y", value=ymin),
            TAG_Int(name="Z", value=zmin+z),
            TAG_Int(name="IDX", value=2),
            TAG_Int(name="IDZ", value=0),
            TAG_Int(name="TS", value=ts),
        ])
        totalx += x
        totaly += ymin
        totalz += zmin+z
        liste.tags.append(door)
    compound.tags.append(liste)
    compound.tags.extend([
        TAG_Int(name="Radius", value=32),
        TAG_Int(name="Stable", value=6605526),
        TAG_Int(name="MTick", value=0),
        TAG_Int(name="Golems", value=1),
        TAG_Int(name="CX", value=int(totalx/22)),
        TAG_Int(name="CY", value=int(totaly/22)),
        TAG_Int(name="CZ", value=int(totalz/22)),
        TAG_Int(name="ACX", value=totalx),
        TAG_Int(name="ACY", value=totaly),
        TAG_Int(name="ACZ", value=totalz),
        TAG_Int(name="PopSize", value=61),
        TAG_Int(name="Tick", value=ts)
    ])
    ## Players = TAG_List(type=TAG_End, name="Players")
    ## compound.tags.append(Players)
    Villages.tags.append(compound)
data.tags.append(Villages)
villages.tags.append(data)
print(villages.pretty_tree())
villages.write_file("villages.dat")
Everything seems to work fine until: villages.write_file("villages.dat")
Also, how do you use TAG_End?
I upgraded our server from 1.6.4 to 1.7.2 yesterday, and suddenly one of my scripts to print the death counters stopped working. After looking into it, it turns out it fails to parse a TAG_List.
Example of what I did to reproduce below. Please note that the scoreboard in minecraft is in fact filled with some objectives.
>>> from nbt import *
>>> nbtfile = nbt.NBTFile("/home/schoentoon/minecraft/survival/world/data/scoreboard.dat", 'rb')
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/usr/local/lib/python2.7/dist-packages/NBT-1.3-py2.7.egg/nbt/nbt.py", line 458, in __init__
self.parse_file()
File "/usr/local/lib/python2.7/dist-packages/NBT-1.3-py2.7.egg/nbt/nbt.py", line 475, in parse_file
self._parse_buffer(self.file)
File "/usr/local/lib/python2.7/dist-packages/NBT-1.3-py2.7.egg/nbt/nbt.py", line 345, in _parse_buffer
tag = TAGLIST[type.value](buffer=buffer)
File "/usr/local/lib/python2.7/dist-packages/NBT-1.3-py2.7.egg/nbt/nbt.py", line 333, in __init__
self._parse_buffer(buffer)
File "/usr/local/lib/python2.7/dist-packages/NBT-1.3-py2.7.egg/nbt/nbt.py", line 345, in _parse_buffer
tag = TAGLIST[type.value](buffer=buffer)
File "/usr/local/lib/python2.7/dist-packages/NBT-1.3-py2.7.egg/nbt/nbt.py", line 265, in __init__
raise ValueError("No type specified for list")
ValueError: No type specified for list
[edit code formatting -- MacFreek]
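A likely cause (my assumption from the traceback, not a confirmed diagnosis): Minecraft serializes an empty TAG_List with element-type byte 0 (TAG_End), and a constructor that rejects type 0 then fails on valid files such as scoreboard.dat. A minimal parser sketch that tolerates this case:

```python
import io
import struct

def parse_list_header(buffer):
    """Read the (element type, length) header of a TAG_List payload.
    Accepts element type 0 (TAG_End) for empty lists instead of
    raising "No type specified for list"."""
    tag_type = struct.unpack(">b", buffer.read(1))[0]
    length = struct.unpack(">i", buffer.read(4))[0]
    if tag_type == 0 and length > 0:
        raise ValueError("TAG_End list with nonzero length")
    return tag_type, length

# An empty list as Minecraft writes it: type byte 0, length 0.
empty_list = io.BytesIO(struct.pack(">bi", 0, 0))
print(parse_list_header(empty_list))  # (0, 0)
```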
@twoolie I accidentally pushed my branches to your repositories. I deleted them within a minute, and no bad stuff happened. In case you got some odd messages (eg Travis failing for one of these branches): that's why.
For some reason, Travis sometimes hangs on examplestests.MobAnalysisScriptTest
:
See e.g.
https://travis-ci.org/twoolie/NBT/jobs/378978732
https://travis-ci.org/twoolie/NBT/jobs/378978735
https://travis-ci.org/macfreek/NBT/jobs/378973931
testAnvilWorld (examplestests.MobAnalysisScriptTest) ...
No output has been received in the last 10m0s, this potentially indicates a stalled build or something wrong with the build itself.
In other runs, there is no problem, and I can't replicate this on my local machine either.
https://travis-ci.org/twoolie/NBT/jobs/378511443
https://travis-ci.org/twoolie/NBT/jobs/378976542
https://travis-ci.org/twoolie/NBT/jobs/378976537
With a large world, world.iter_nbt() caused a "too many open files" exception, since it never closes the region files. The simple fix is to add a few lines to close the region files in world.py at line 95, so it looks like this:
def iter_nbt(self):
    """
    Return an iterable list of all NBT. Use this function if you only
    want to loop through the chunks once, and don't need the block or data arrays.
    """
    # TODO: Implement BoundingBox
    # TODO: Implement sort order
    for region in self.iter_regions():
        for c in region.iter_chunks():
            yield c
        if hasattr(region.file, 'fileobj') and region.file.fileobj:  # <- added
            region.file.fileobj.close()                              # <- added
        region.file.close()                                          # <- added
edit: Updated the fix after encountering this again in an even larger world.
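A variant that also closes the file when the caller abandons the iterator early would wrap the inner loop in try/finally. The sketch below is self-contained, with tiny fake region objects standing in for nbt.region.RegionFile:

```python
class FakeFile:
    """Stand-in for a region's underlying file object."""
    def __init__(self):
        self.closed = False
        self.fileobj = None  # gzip wrapper, when present
    def close(self):
        self.closed = True

class FakeRegion:
    """Stand-in for nbt.region.RegionFile, just for this demo."""
    def __init__(self, chunks):
        self._chunks = chunks
        self.file = FakeFile()
    def iter_chunks(self):
        return iter(self._chunks)

def iter_nbt(regions):
    """Yield all chunks; close each region file even if iteration of a
    region is interrupted (try/finally instead of plain close calls)."""
    for region in regions:
        try:
            for c in region.iter_chunks():
                yield c
        finally:
            fileobj = getattr(region.file, 'fileobj', None)
            if fileobj:
                fileobj.close()
            region.file.close()

regions = [FakeRegion([1, 2]), FakeRegion([3])]
print(list(iter_nbt(regions)))              # [1, 2, 3]
print(all(r.file.closed for r in regions))  # True
```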
@stumpylog @twoolie Hey Trenton, I see you're active again, which is great. However, some of the changes may introduce some backward incompatibility, so we need to find a balance between fast progress (rapid prototyping) and robustness.
I propose the following:
Regular bug fixes and documentation enhancements should take place in the master branch, and should be ported to the v2.x branch (not the other way around, please).
This can work, but requires some important coding hygiene:
To ease things, I've tagged all issues with API changes, and also added a 2.0 milestone.
Some items that affect lots of smaller parts in the code, like documentation changes and fixing of trailing whitespace, are tedious with multiple branches, so I recommend making these types of changes either now or waiting until 2.0 is released.
Note that this is also the time to propose code restructuring. I personally think the code is in very good shape, but here are two suggestions in case someone wants to pick them up: a clean-up of function names (e.g. region.get_chunks does not return a chunk or NBT tag, but only the chunk coordinates; most of the function names in chunk.Chunk are still specific to block IDs, without data IDs), and changes to help speed things up (either by changing some API functions so faster numpy functions can be used, or by adding caches).