Comments (8)
Sure ... if you make it optional
from ifiltertextreader.
I just released a new package https://www.nuget.org/packages/IFilterTextReader/1.6.1
from ifiltertextreader.
New package works great, thanks!
from ifiltertextreader.
@Sicos1977 and @mguinness the only problem with this is that metadata properties can be duplicated, e.g.
Names: foo
Names: bar
In this scenario, adding to the dictionary throws a "key already exists" exception. I'll log a separate issue for this as well.
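For anyone following along, here is a minimal sketch of the failure, assuming the properties end up in a plain `Dictionary<string, object>` filled with `Add`. The variable name `metaData` is only a stand-in for illustration, not the library's actual field. It is written as C# top-level statements.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for the metadata dictionary (not the library's actual field).
var metaData = new Dictionary<string, object>();
metaData.Add("Names", "foo");

bool duplicateRejected = false;
try
{
    // A second value under the same key makes Dictionary.Add throw ArgumentException.
    metaData.Add("Names", "bar");
}
catch (ArgumentException)
{
    duplicateRejected = true;
}

Console.WriteLine($"Duplicate rejected: {duplicateRejected}, entries kept: {metaData.Count}");
```

Using the indexer (`metaData["Names"] = "bar"`) would avoid the exception, but it would silently overwrite "foo", so it does not really solve the problem either.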
from ifiltertextreader.
Out of interest, what is the filtdump output for an example file? I imagine the tags are coming from different sections in the file. Changing the field type to List<KeyValuePair<string, object>> would work.
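A rough sketch of what that could look like, assuming the caller is happy to look values up with LINQ instead of an indexer. The `metaData` and `names` variables are illustrative only and not taken from the library.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// A list of pairs accepts the same key more than once, unlike a dictionary.
var metaData = new List<KeyValuePair<string, object>>
{
    new KeyValuePair<string, object>("Names", "foo"),
    new KeyValuePair<string, object>("Names", "bar"),
};

// Collect every value recorded under one property name, in insertion order.
var names = metaData
    .Where(pair => pair.Key == "Names")
    .Select(pair => pair.Value)
    .ToList();

Console.WriteLine(string.Join(", ", names));
```

The trade-off is that lookups become a linear scan and callers have to expect zero, one, or many values per name.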
from ifiltertextreader.
@mguinness - sorry I didn't get back to this sooner. In this case the values come from the same section, but the "different sections" case is also a problem. The filtdump output for the file is:
CHUNK: ---------------------------------------------------------------
Attribute = {2C443B1E-F1E2-404F-974D-E21FEF8E70AA}\Names
idChunk = 13
BreakType = 2 (Sentence)
Flags (chunkstate) = (Value)
Locale = 2057 (0x809)
IdChunkSource = 13
cwcStartSource = 0
cwcLenSource = 0
VALUE: ---------------------------------------------------------------
Type = 31 (0x1f), VT_LPWSTR
Value = "Test A"
CHUNK: ---------------------------------------------------------------
Attribute = {2C443B1E-F1E2-404F-974D-E21FEF8E70AA}\Names
idChunk = 14
BreakType = 2 (Sentence)
Flags (chunkstate) = (Value)
Locale = 2057 (0x809)
IdChunkSource = 14
cwcStartSource = 0
cwcLenSource = 0
VALUE: ---------------------------------------------------------------
Type = 31 (0x1f), VT_LPWSTR
Value = "Test B"
<rdf:Description rdf:about=""
xmlns:TestSchema="http://test">
<TestSchema:Names>
<rdf:Bag>
<rdf:li>Test A</rdf:li>
<rdf:li>Test B</rdf:li>
</rdf:Bag>
</TestSchema:Names>
</rdf:Description>
Now, whilst we changed to <string, object> (and I'm going to look at this again soon), I seem to recall thinking that including the schema in the output would be useful. I'm pretty sure I found that the string key becomes just 'Names'. So if one purpose is to let an application filter on a specific property, the metadata output doesn't let you tell apart the same name coming from different paths when there is a conflict. So for example, I have
Where we have System.Title, title and Title.
One of them is dc:title, the other is TestSchema:Title, and presumably System.Title is the default document title outside the metadata. I think this is the issue you were hinting at?
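To make the idea concrete, here is a sketch of keying on the property set as well as the name, and keeping a list of values per key. This is only one possible shape under my own assumptions: the `AddProperty` helper, the tuple key, and the title values are all hypothetical, and the "dc" and "TestSchema" strings stand in for whatever identifier the filter actually reports. Only the GUID is taken from the filtdump output above.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical shape: key on (property set, name) and keep every value seen for that key.
var metaData = new Dictionary<(string PropertySet, string Name), List<object>>();

// Hypothetical helper: appends to the existing list or starts a new one.
void AddProperty(string propertySet, string name, object value)
{
    var key = (propertySet, name);
    if (!metaData.TryGetValue(key, out var values))
    {
        values = new List<object>();
        metaData[key] = values;
    }
    values.Add(value);
}

// Property set GUID as shown in the filtdump output above.
const string testSchemaGuid = "{2C443B1E-F1E2-404F-974D-E21FEF8E70AA}";

AddProperty(testSchemaGuid, "Names", "Test A");
AddProperty(testSchemaGuid, "Names", "Test B");

// Placeholder values; the same name is used for both to show the conflict case.
AddProperty("dc", "Title", "title from dc");
AddProperty("TestSchema", "Title", "title from TestSchema");

foreach (var entry in metaData)
{
    Console.WriteLine($"{entry.Key.PropertySet}\\{entry.Key.Name} = {string.Join(", ", entry.Value)}");
}
```

With this layout the two titles no longer collide, and multi-valued properties such as Names keep all of their values.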
from ifiltertextreader.
Thanks for the reply. The example you cited seems more like an array of names. Can you upload a small example document?
from ifiltertextreader.
@mguinness - it was indeed an array of names. Sample image uploaded below (hopefully GitHub doesn't modify it):
from ifiltertextreader.