Comments (4)
This is probably the same issue and root-case that i stumbled upon, so i'm hijacking this issue with my analysis :)
I found that this stems from a subnode VCN value of 0 inside a NtfsAttributeListNonResidentAttributeValue
.
In index.rs:177
, the call to subnode_vcn looks like this:
index_allocation: NtfsIndexAllocation { ntfs: Ntfs { cluster_size: 4096, sector_size: 512, size: 106701438976, mft_position: 3221225472, file_record_size: 1024, serial_number: 9370544425783817375, upcase_table: None }, value: AttributeLis
tNonResident(NtfsAttributeListNonResidentAttributeValue { ntfs: Ntfs { cluster_size: 4096, sector_size: 512, size: 106701438976, mft_position: 3221225472, file_record_size: 1024, serial_number: 9370544425783817375, upcase_table: None }, initial_attribute_
list_entries: NtfsAttributeListEntries { attribute_list: Resident([160, 0, 0, 0, 40, 0, 4, 26, 0, 0, 0, 0, 0, 0, 0, 0, 48, 84, 0, 0, 0, 0, 1, 0, 1, 0, 36, 0, 73, 0, 51, 0, 48, 0, 0, 0, 0, 0, 0, 0, 176, 0, 0, 0, 40, 0, 4, 26, 0, 0, 0, 0, 0, 0, 0, 0, 48, 84
, 0, 0, 0, 0, 1, 0, 2, 0, 36, 0, 73, 0, 51, 0, 48, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 40, 0, 4, 26, 0, 0, 0, 0, 0, 0, 0, 0, 48, 84, 0, 0, 0, 0, 1, 0, 3, 0, 36, 0, 68, 0, 83, 0, 67, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 48, 0, 9, 26, 0, 0, 0, 0, 0, 0, 0, 0, 69, 2
2, 0, 0, 0, 0, 1, 0, 12, 0, 36, 0, 84, 0, 88, 0, 70, 0, 95, 0, 68, 0, 65, 0, 84, 0, 65, 0, 0, 0, 0, 0], 3227063552) }, connected_entries: AttributeListConnectedEntries { attribute_list_entries: Some(NtfsAttributeListEntries { attribute_list: Resident([160
, 0, 0, 0, 40, 0, 4, 26, 0, 0, 0, 0, 0, 0, 0, 0, 48, 84, 0, 0, 0, 0, 1, 0, 1, 0, 36, 0, 73, 0, 51, 0, 48, 0, 0, 0, 0, 0, 0, 0, 176, 0, 0, 0, 40, 0, 4, 26, 0, 0, 0, 0, 0, 0, 0, 0, 48, 84, 0, 0, 0, 0, 1, 0, 2, 0, 36, 0, 73, 0, 51, 0, 48, 0, 0, 0, 0, 0, 0, 0
, 0, 1, 0, 0, 40, 0, 4, 26, 0, 0, 0, 0, 0, 0, 0, 0, 48, 84, 0, 0, 0, 0, 1, 0, 3, 0, 36, 0, 68, 0, 83, 0, 67, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 48, 0, 9, 26, 0, 0, 0, 0, 0, 0, 0, 0, 69, 22, 0, 0, 0, 0, 1, 0, 12, 0, 36, 0, 84, 0, 88, 0, 70, 0, 95, 0, 68, 0,
65, 0, 84, 0, 65, 0, 0, 0, 0, 0], 3227063552) }), instance: 1, ty: IndexAllocation }, data_size: 10747904, attribute_state: None, stream_state: StreamState { stream_data_run: None, stream_position: 0, data_size: 10747904 } }) },
index_record_size: 4096,
vcn: 0
The issue, afaict, is that the seek implementation for NtfsAttributeListNonResidentAttributeValue
does not handle a vcn value of 0. It expects data_to_seek
to be strictly positive, but do not handle it being 0 and walking inside the next data run of the current attribute, or the first data run of the next connect attribute.
I don't know enough about NTFS whether this is a valid case or if some parsing failed earlier, but it looks legit from what i've seen, and changing the check on data_to_seek
into a loop seems to fix the issue.
I have attached a tentative PR for this: #11
from ntfs.
You're right, I just tested #11 and it does fix it the issue for me too.
from ntfs.
@leofidus @vthib Thanks a lot for your bug report and the analysis!
I dug into the related code again and I'm afraid this requires more than a 1-line fix to handle all such cases:
NtfsNonResidentAttributeValue
and NtfsAttributeListNonResidentAttributeValue
both share code in StreamState
to read and seek in Data Runs.
When StreamState
is created via StreamState::new
, its stream_data_run
field is initialized to None
. The field is only filled with an actual Data Run when the first non-zero read or seek is performed. That way, StreamState::new
remains simple and we only deal with Data Runs when actually required. There are also other cases when stream_data_run
may be None
:
- We seek past the end of all Data Runs
- We encounter a non-resident attribute with zero Data Runs (may not be valid as per NTFS spec, but the crate must not panic here)
The design above gets problematic as soon as I want to return the absolute filesystem byte position of the current seek position. I only have a valid value for that if stream_data_run
is filled with a Some
value.
Additionally, the Data Run itself can only know about its absolute seek position if we haven't seeked past its bounds and it's not a sparse Data Run. This is why all data_position
functions return an Option<u64>
instead of a simple definite u64
.
Now I have multiple other parts of the code, which urgently need to know the absolute filesystem byte position, and can't just deal with an Option
.
They all use .data_position().unwrap()
and are supposed to previously check the conditions, so that .unwrap()
cannot panic. That was thought to be easy: As that code also performs the seeks, it can make sure that it seeks within the bounds. It can also check for sparse Data Runs.
However, I must have missed that the data_position
function of NtfsNonResidentAttributeValue
and NtfsAttributeListNonResidentAttributeValue
may also return None
when their stream_data_run
is None
. The function-level comment doesn't mention that possibility.
Actually, I didn't entirely miss it. NtfsNonResidentAttributeValue::new
has extra code to mitigate that:
ntfs/src/attribute_value/non_resident.rs
Lines 48 to 52 in be1fc19
This mitigation fails though when you seek to SeekFrom::Start(0)
.
NtfsAttributeListNonResidentAttributeValue
misses any such workarounds, which is why you're encountering this bug in the first place.
PR #11 mitigates this bug, but only if you seek before calling data_position
. It panics the same way if you don't seek at all.
None of the mitigations work when a non-resident attribute has no Data Runs at all.
I need some more time to think this through and come up with a definite solution that works in all cases.
- Perhaps I can actually get the callers of
data_position
to handle anOption
instead of unwrapping it. They usually don't work on it themselves, but pass it through to other functions. Could require quite some refactoring. - Maybe some of them don't even need the absolute byte position at all. I mostly use it to provide informative error messages.
- I might also be able to just add some missing checks before unwrapping, and can thereby ensure that the
Option
is aSome
value.
This may take some time. In any case, thanks for uncovering this bug and the related #9!
from ntfs.
I think #9 is a good example of how the error types should have position as an Option<u64>
. Then the code would generate a error message
NtfsError::InvalidStructuredValueSize {
position: None,
ty: NtfsAttributeType::VolumeName,
expected: 1,
actual: 0,
}
which is completely reasonable (why would there be a data run if there is no data). From a cursory look changing the error types to an option and passing the option along as long as possible would also get rid of a lot of unwraps. It wouldn't solve #9 or #10 on its own, but I think it's a reasonable starting point.
from ntfs.
Related Issues (18)
- Multiply with oveflow when parsing malformed file system
- 2MB clusters lead to "The cluster size is 124928 bytes, but the maximum supported one is 2097152" HOT 3
- Large Sector Sizes HOT 1
- UEFI support HOT 2
- Publish a `0.2` or `0.1.x` version HOT 1
- NtfsFile::data string comparison is not case insensitive HOT 4
- Slice index out of bounds when parsing upcase table for malformed FS HOT 1
- Panic with malformed ntfs in non_resident_value_data_and_position HOT 4
- Infinite loop on NtfsAttributes custom iterator next HOT 3
- Crash in `Record::fixup`, `array_position_end` out of bounds of `self.data` HOT 1
- Crash on `Record::fixup` `sector_position_end` out of bounds of data length and HOT 3
- Crash on `Record::update_sequence_array_count`, substraction overflow HOT 1
- Is it safe to scan a live disk? HOT 2
- Example is outdated
- NTFS file at byte position 0xb820fe000 has no attribute HOT 5
- PhysicalDrive0 vs C: HOT 2
- Panic on empty volume name HOT 2
Recommend Projects
-
React
A declarative, efficient, and flexible JavaScript library for building user interfaces.
-
Vue.js
🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
-
Typescript
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
-
TensorFlow
An Open Source Machine Learning Framework for Everyone
-
Django
The Web framework for perfectionists with deadlines.
-
Laravel
A PHP framework for web artisans
-
D3
Bring data to life with SVG, Canvas and HTML. 📊📈🎉
-
Recommend Topics
-
javascript
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
-
web
Some thing interesting about web. New door for the world.
-
server
A server is a program made to process requests and deliver data to clients.
-
Machine learning
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
-
Visualization
Some thing interesting about visualization, use data art
-
Game
Some thing interesting about game, make everyone happy.
Recommend Org
-
Facebook
We are working to build community through open source technology. NB: members must have two-factor auth.
-
Microsoft
Open source projects and samples from Microsoft.
-
Google
Google ❤️ Open Source for everyone.
-
Alibaba
Alibaba Open Source for everyone
-
D3
Data-Driven Documents codes.
-
Tencent
China tencent open source team.
from ntfs.