Comments (6)
More details:
In package_model.py
(of pyecma376_2), the check is made:
def check_part_name(part_name: str) -> None:
""" Check if `part_name` is a valid OPC part name in URI (not IRI) representation.
:raises ValueError: if it is not.
"""
if not RE_PART_NAME.match(part_name):
raise ValueError("{} is not an URI path with multiple segments (each not empty and not starting with '.') "
"or not starting with '/' or ending wit '/'".format(repr(part_name)))
And RE_PART_NAME
does not allow spaces:
RE_PART_NAME = re.compile(r'^(/[A-Za-z0-9\-\._~%:@!$&\'()*+,;=]*[A-Za-z0-9\-_~%:@!$&\'()*+,;=])+$')
Let's first conclude that \.
and \'
are equivalent to .
and '
, because it's inside a bracket expression and those don't need to be escaped there.
The first bracket expression contains the .
, the second doesn't. Also, the first bracket expression is optional, the second isn't. So, to me, the Regex is checking whether something does not end with a dot, rather than "not starting with '.'" as mentioned in the error message. So, at least this part may be a bug in the pyecma library (either the error message or the Regex).
Other than that, I don't have enough knowledge to judge whether ECMA 376 allows spaces in file names or not.
from basyx-python-sdk.
OPC Specification IMHO is here: https://standards.iso.org/ittf/PubliclyAvailableStandards/c077818_ISO_IEC_29500-2_2021(E).zip
(download link suggested by Wikipedia does not work).
The relevant chapter seems to be 6.2.2 Part Names. And 6.2.2.2 Syntax specifies:
No I18N segments shall end with a dot (“.”) character.
So this is really about ending with a dot.
Furthermore:
[...] where an I18N segment is a Unicode string that
matches the non-terminal isegment-nz and percent-encoding represents a character by the percent
character "%" followed by two hexadecimal digits, as specified in RFC 3986
So it seems to me that the Basyx LIbrar should percent-encode the space to %20.
from basyx-python-sdk.
I agree, perhaps we should already call check_part_name()
within DictSupplementaryFileContainer.add_file()
, and if that raises an exception, percent encode the part name? Or just always percent encode?
from basyx-python-sdk.
Alright, I did some digging:
When writing supplementary files, the AASXWriter
does the following:
basyx-python-sdk/basyx/aas/adapter/aasx.py
Lines 555 to 559 in 0927305
For each supplementary file, it calls OPCPackageReader.open_part()
with the filename supplied by the user, the way the file was added to the SupplementaryFileContainer
. This determines the name the file will have in the AASX file.
Afterwards, it adds the normalized filename to the list supplementary_file_names
, which will later be used to write the relationship files.
Normalization percent-encodes the string and converts it to lower case:
https://github.com/rwth-iat/PyECMA376-2/blob/e7c28ca9c9fbebfae3d3ab9a8abc104aa57f7913/pyecma376_2/package_model.py#L642-L645
Thus, it seems to me, that only the relationship files are expected to contain normalized, percent-encoded URIs, and the filenames are actually not expected to follow this format. If the filenames would be percent-encoded and converted to lower case, it would also be impossible to determine the original filename before packaging, and the assertion at the end of tutorial_aasx.py
would never be true:
Yet, OPCPackageWriter.open_part()
and OPCPackageWriter.create_fragmented_part()
both call check_part_name()
on the passed filename, which validates the filename against the regular expression for part URIs.
So, I had a look at the specification you linked and it doesn't confirm my thoughts (well, partially):
7.3.4 Mapping logical item names to ZIP item names
For each logical item, the process of mapping of logical item names to ZIP item names shall involve the following steps, in order:
- Remove the leading forward slash (“/”) from the logical item name or, in the case of interleaved parts, from each of the logical item names within the complete sequence.
Remove the leading forward slash (“/”) from the logical item name or, parts, from each of the logical item names within the complete sequence.- Percent-encode every item non-ASCII character.
Validating the filenames via check_part_name()
is wrong, as spaces are ASCII characters and thus allowed in filenames.
However, non-ASCII characters still need to be percent-encoded. Thus, the correct way would be to replace the call to check_part_name()
by a call to a function that percent-encodes the passed filename, or at least to change the current validation.
from basyx-python-sdk.
Did I understand correctly, that this is a bug in the check_part_name()
function of the pyecma376_2 library?
from basyx-python-sdk.
I think the check_part_name()
function is correct, what's incorrect is that this function is used to validate filenames for the zip package. IMO, a function to correctly percent-encode filenames for AASX packages should be implemented in pyecma376_2.
pyecma376_2
already has a function to percent-encode and convert to lower-case part names. However, the way I understand it is that this percent-encoding for part-names is different from the percent-encoding for filenames, as it
- converts the entire string to lower-case and
- also encodes characters that would be allowed in filenames, such as spaces.
from basyx-python-sdk.
Related Issues (20)
- Misleading error message in case of non-existent file HOT 1
- Update `Development Status` classifier in `setup.py` HOT 1
- Integer value is not accepted for Double value type HOT 4
- aasx: Add failsafe for writing and reading AASX files HOT 1
- Convert primitive types into string when creating a property of type string HOT 2
- Logo: Change logo to the original version
- AAS with submodel/type: "ExternalReference" not supported HOT 7
- Reading published AAS Submodel Templates from aasx HOT 11
- Unable to add SubmodelElementCollection to SubmodelElementList HOT 9
- Concept Description missing from AAS (AASX-file) when External Reference is used. HOT 5
- Remove schema files from repository HOT 1
- Missing 'aasx-spec' document in AASX HOT 9
- RELATIONSHIP_TYPE_AASX_ORIGIN strings do not match those in the submodel templates from IDTA HOT 2
- model: Allow empty strings in SubmodelElement.value HOT 1
- How to generate unique IDs? HOT 4
- Does a property lose its reference to a concept description when reading an AASX file? HOT 6
- Log a warning if no AAS files are read within an AASX package
- aas:adapter Unable to read aasx file, because of dead links HOT 10
- convert json into aasx fails HOT 4
Recommend Projects
-
React
A declarative, efficient, and flexible JavaScript library for building user interfaces.
-
Vue.js
🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
-
Typescript
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
-
TensorFlow
An Open Source Machine Learning Framework for Everyone
-
Django
The Web framework for perfectionists with deadlines.
-
Laravel
A PHP framework for web artisans
-
D3
Bring data to life with SVG, Canvas and HTML. 📊📈🎉
-
Recommend Topics
-
javascript
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
-
web
Some thing interesting about web. New door for the world.
-
server
A server is a program made to process requests and deliver data to clients.
-
Machine learning
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
-
Visualization
Some thing interesting about visualization, use data art
-
Game
Some thing interesting about game, make everyone happy.
Recommend Org
-
Facebook
We are working to build community through open source technology. NB: members must have two-factor auth.
-
Microsoft
Open source projects and samples from Microsoft.
-
Google
Google ❤️ Open Source for everyone.
-
Alibaba
Alibaba Open Source for everyone
-
D3
Data-Driven Documents codes.
-
Tencent
China tencent open source team.
from basyx-python-sdk.