fbchat-archive-parser's People

Contributors

artempal, fohlin, githubuser158742, gpollo, jurf, matnguyen, null665, phpxp, ptalmeida, seanny123, virenmohindra

fbchat-archive-parser's Issues

UnicodeEncodeError: 'ascii' codec can't encode character u'\u2026' in position 7

When I attempt to extract just one conversation from my messages archive, it dies with:

$ fbcap -t 'Elizabeth' messages.htm > us.txt
Traceback (most recent call last):
  File "/usr/local/bin/fbcap", line 9, in <module>
    load_entry_point('fbchat-archive-parser==0.4.post2', 'console_scripts', 'fbcap')()
  File "build/bdist.macosx-10.10-intel/egg/fbchat_archive_parser/main.py", line 66, in main
  File "/Library/Python/2.7/site-packages/clip.py", line 652, in run
    self.invoke(self.parse(tokens))
  File "/Library/Python/2.7/site-packages/clip.py", line 634, in invoke
    self._main.invoke(parsed)
  File "/Library/Python/2.7/site-packages/clip.py", line 519, in invoke
    self._callback(**{k: v for k, v in iteritems(parsed) if k not in self._subcommands})
  File "build/bdist.macosx-10.10-intel/egg/fbchat_archive_parser/main.py", line 31, in fbcap
  File "build/bdist.macosx-10.10-intel/egg/fbchat_archive_parser/writers/__init__.py", line 22, in write
  File "build/bdist.macosx-10.10-intel/egg/fbchat_archive_parser/writers/writer.py", line 14, in write
  File "build/bdist.macosx-10.10-intel/egg/fbchat_archive_parser/writers/text.py", line 29, in write_history
  File "build/bdist.macosx-10.10-intel/egg/fbchat_archive_parser/writers/text.py", line 42, in write_thread
  File "build/bdist.macosx-10.10-intel/egg/fbchat_archive_parser/writers/text.py", line 53, in write_message
  File "/Library/Python/2.7/site-packages/colorama/ansitowin32.py", line 36, in write
    self.__convertor.write(text)
  File "/Library/Python/2.7/site-packages/colorama/ansitowin32.py", line 137, in write
    self.write_and_convert(text)
  File "/Library/Python/2.7/site-packages/colorama/ansitowin32.py", line 165, in write_and_convert
    self.write_plain_text(text, cursor, len(text))
  File "/Library/Python/2.7/site-packages/colorama/ansitowin32.py", line 170, in write_plain_text
    self.wrapped.write(text[start:end])
UnicodeEncodeError: 'ascii' codec can't encode character u'\u2026' in position 7: ordinal not in range(128)

I'm now trying to convert the file to ASCII with uni2ascii to see if I can get around this.
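
The traceback suggests that stdout falls back to the ASCII codec once output is redirected to a file, so the ellipsis character (U+2026) cannot be written. A minimal sketch of the failure and one possible workaround, assuming the fix is to wrap the byte stream in an explicit UTF-8 writer. `utf8_writer` is a hypothetical helper for illustration, not part of fbcap:

```python
import io


def utf8_writer(byte_stream):
    """Wrap a binary stream so that text written to it is encoded as UTF-8."""
    return io.TextIOWrapper(byte_stream, encoding="utf-8")


text = u"Hello\u2026"

# Encoding with the ASCII codec fails the same way as in the traceback above.
try:
    text.encode("ascii")
    ascii_ok = True
except UnicodeEncodeError:
    ascii_ok = False

# Writing through a UTF-8 wrapper succeeds.
buffer = io.BytesIO()
writer = utf8_writer(buffer)
writer.write(text)
writer.flush()
encoded = buffer.getvalue()
```

Setting the `PYTHONIOENCODING=utf-8` environment variable before running the command is another workaround that may help here, since it tells Python which encoding to use for stdout.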

Issue with ValueError time data does not match format

C:\Users\*REDACTED*\Documents\Powershell\messages\all>fbcap ./messages.htm
Discovered chat thread with [*REDACTED*, *REDACTED*]...Traceback (most recent call last):
  File "C:\Users\*REDACTED*\AppData\Local\Programs\Python\Python35-32\Scripts\fbcap-script.py", line 9, in <module>
    load_entry_point('fbchat-archive-parser==0.7.post2', 'console_scripts', 'fbcap')()
  File "c:\users\*REDACTED*\appdata\local\programs\python\python35-32\lib\site-packages\fbchat_archive_parser\main.py", line 137, in main
    app.run()
  File "c:\users\*REDACTED*\appdata\local\programs\python\python35-32\lib\site-packages\clip.py", line 652, in run
    self.invoke(self.parse(tokens))
  File "c:\users\*REDACTED*\appdata\local\programs\python\python35-32\lib\site-packages\clip.py", line 634, in invoke
    self._main.invoke(parsed)
  File "c:\users\*REDACTED*\appdata\local\programs\python\python35-32\lib\site-packages\clip.py", line 519, in invoke
    self._callback(**{k: v for k, v in iteritems(parsed) if k not in self._subcommands})
  File "c:\users\*REDACTED*\appdata\local\programs\python\python35-32\lib\site-packages\fbchat_archive_parser\main.py", line 87, in fbcap
    raise e
  File "c:\users\*REDACTED*\appdata\local\programs\python\python35-32\lib\site-packages\fbchat_archive_parser\main.py", line 70, in fbcap
    fbch = parse_data(parser_call)
  File "c:\users\*REDACTED*\appdata\local\programs\python\python35-32\lib\site-packages\fbchat_archive_parser\main.py", line 92, in parse_data
    return parser_call()
  File "c:\users\*REDACTED*\appdata\local\programs\python\python35-32\lib\site-packages\fbchat_archive_parser\parser.py", line 124, in __init__
    self._parse_content(bs4)
  File "c:\users\*REDACTED*\appdata\local\programs\python\python35-32\lib\site-packages\fbchat_archive_parser\parser.py", line 146, in _parse_content
    self._process_element(pos, element)
  File "c:\users\*REDACTED*\appdata\local\programs\python\python35-32\lib\site-packages\fbchat_archive_parser\parser.py", line 325, in _process_element
    self._parse_time(e)
  File "c:\users\*REDACTED*\appdata\local\programs\python\python35-32\lib\site-packages\fbchat_archive_parser\parser.py", line 242, in _parse_time
    timestamp = datetime.strptime(timestamp, self._DATE_FORMAT)
  File "c:\users\*REDACTED*\appdata\local\programs\python\python35-32\lib\_strptime.py", line 510, in _strptime_datetime
    tt, fraction = _strptime(data_string, format)
  File "c:\users\*REDACTED*\appdata\local\programs\python\python35-32\lib\_strptime.py", line 343, in _strptime
    (data_string, format))
ValueError: time data 'Saturday, 9 July 2016 at 21:50' does not match format '%A, %B %d, %Y at %I:%M%p'
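
The timestamp in the error is day-first and 24-hour, while the format string expects the US month-first, 12-hour layout. A minimal sketch of trying several formats in turn, assuming the UK-style format below is the right one for this archive. `parse_with_fallback` and `UK_FORMAT` are hypothetical names, not taken from the project:

```python
from datetime import datetime

US_FORMAT = "%A, %B %d, %Y at %I:%M%p"  # format quoted in the error message
UK_FORMAT = "%A, %d %B %Y at %H:%M"     # assumed format for the failing timestamp


def parse_with_fallback(raw, formats=(US_FORMAT, UK_FORMAT)):
    """Try each candidate format in order; raise ValueError if none match.

    Note: %A and %B depend on the current locale, so this assumes an
    English (or default "C") locale.
    """
    for fmt in formats:
        try:
            return datetime.strptime(raw, fmt)
        except ValueError:
            continue
    raise ValueError("unrecognised timestamp: %r" % raw)


parsed = parse_with_fallback("Saturday, 9 July 2016 at 21:50")
```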

xml.etree.ElementTree.ParseError: not well-formed (invalid token): line 5863, column 12969

I'm using the newest version. I'm new to Python, so I'm not sure what this error means.

Thank you!

  File "//anaconda/bin/fbcap", line 11, in <module>
    sys.exit(main())
  File "//anaconda/lib/python3.5/site-packages/fbchat_archive_parser/main.py", line 66, in main
    app.run()
  File "//anaconda/lib/python3.5/site-packages/clip.py", line 652, in run
    self.invoke(self.parse(tokens))
  File "//anaconda/lib/python3.5/site-packages/clip.py", line 634, in invoke
    self._main.invoke(parsed)
  File "//anaconda/lib/python3.5/site-packages/clip.py", line 519, in invoke
    self._callback(**{k: v for k, v in iteritems(parsed) if k not in self._subcommands})
  File "//anaconda/lib/python3.5/site-packages/fbchat_archive_parser/main.py", line 27, in fbcap
    progress_output=sys.stdout.isatty())
  File "//anaconda/lib/python3.5/site-packages/fbchat_archive_parser/parser.py", line 98, in __init__
    self.__parse_content()
  File "//anaconda/lib/python3.5/site-packages/fbchat_archive_parser/parser.py", line 107, in __parse_content
    for pos, element in ET.iterparse(self.stream, events=("start", "end")):
  File "//anaconda/lib/python3.5/xml/etree/ElementTree.py", line 1289, in __next__
    for event in self._parser.read_events():
  File "//anaconda/lib/python3.5/xml/etree/ElementTree.py", line 1272, in read_events
    raise event
  File "//anaconda/lib/python3.5/xml/etree/ElementTree.py", line 1230, in feed
    self._parser.feed(data)
xml.etree.ElementTree.ParseError: not well-formed (invalid token): line 5863, column 12969
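
For anyone new to Python: this error comes from an XML parser, which rejects markup that a more lenient HTML parser would accept. The actual token at line 5863 is not shown in the report, so the snippet below is a made-up example of the general problem, using only the standard library. `SNIPPET`, `xml_parses`, and `TextCollector` are hypothetical names:

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser

# Hypothetical fragment: a bare "&" is tolerated by HTML parsers
# but is not well-formed XML.
SNIPPET = "<div><p>Tom & Jerry</p></div>"


def xml_parses(markup):
    """Return True if the strict XML parser accepts the markup."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False


class TextCollector(HTMLParser):
    """Lenient HTML parser that just collects the text content."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)


collector = TextCollector()
collector.feed(SNIPPET)
collector.close()
text = "".join(collector.chunks)
```

Here the strict parser fails on the snippet while the lenient one still recovers the text, which is roughly the trade-off a parser for exported HTML has to deal with.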

Only shows Facebook id

It seems like Facebook has changed something in their system, as I cannot use the thread option. It only shows @facebook.com addresses now. An older file I have works just fine.

Type Error

Hello, I got this error while parsing:

Traceback (most recent call last):
  File "/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/site-packages/fbchat_archive_parser/main.py", line 63, in fbcap
    fbch = parse_data(parser_call)
  File "/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/site-packages/fbchat_archive_parser/main.py", line 85, in parse_data
    return parser_call()
  File "/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/site-packages/fbchat_archive_parser/parser.py", line 124, in __init__
    self._parse_content(bs4)
  File "/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/site-packages/fbchat_archive_parser/parser.py", line 145, in _parse_content
    parser=XMLParser(encoding=str('UTF-8'))):
  File "/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/xml/etree/ElementTree.py", line 1294, in __next__
    for event in self._parser.read_events():
  File "/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/xml/etree/ElementTree.py", line 1277, in read_events
    raise event
  File "/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/xml/etree/ElementTree.py", line 1235, in feed
    self._parser.feed(data)
xml.etree.ElementTree.ParseError: not well-formed (invalid token): line 2151, column 22906

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/Library/Frameworks/Python.framework/Versions/3.4/bin/fbcap", line 9, in <module>
    load_entry_point('fbchat-archive-parser==0.7', 'console_scripts', 'fbcap')()
  File "/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/site-packages/fbchat_archive_parser/main.py", line 131, in main
    app.run()
  File "/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/site-packages/clip.py", line 652, in run
    self.invoke(self.parse(tokens))
  File "/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/site-packages/clip.py", line 634, in invoke
    self._main.invoke(parsed)
  File "/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/site-packages/clip.py", line 519, in invoke
    self._callback(**{k: v for k, v in iteritems(parsed) if k not in self._subcommands})
  File "/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/site-packages/fbchat_archive_parser/main.py", line 80, in fbcap
    raise e
  File "/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/site-packages/fbchat_archive_parser/main.py", line 68, in fbcap
    fbch = parse_data(parser_call(bs4=True))
  File "/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/site-packages/fbchat_archive_parser/main.py", line 85, in parse_data
    return parser_call()
TypeError: 'FacebookChatHistory' object is not callable
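
Reading the second traceback, `parse_data` calls its argument, but on the fallback path it receives the result of `parser_call(bs4=True)`, which is already an object rather than something callable. A minimal sketch of that pattern and one way to keep the argument callable, assuming `functools.partial` is an acceptable fix. `FakeHistory` is a hypothetical stand-in, not the project's class:

```python
from functools import partial


class FakeHistory(object):
    """Hypothetical stand-in for the parser class named in the traceback."""

    def __init__(self, bs4=False):
        self.bs4 = bs4


def parse_data(parser_call):
    # Mirrors the traceback: the argument is expected to be callable.
    return parser_call()


# Passing an already-constructed instance reproduces the TypeError...
try:
    parse_data(FakeHistory(bs4=True))
    failed = False
except TypeError:
    failed = True

# ...while deferring construction with partial() keeps the argument callable.
history = parse_data(partial(FakeHistory, bs4=True))
```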

fbcap: command not found

I installed pip and ran pip install fbchat-archive-parser. Running fbcap ./messages.htm results in:

No command 'fbcap' found, did you mean:
 Command 'fbcat' from package 'fbcat' (universe)
fbcap: command not found

Running on Ubuntu 16.04 via the Windows Subsystem for Linux.
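
One possible cause, assuming the install itself succeeded, is that pip placed the `fbcap` script in a directory that is not on `PATH`. A rough sketch for listing the usual script directories and checking which ones are missing from `PATH`; the helper names are hypothetical and the user-level `bin` path reflects the Linux layout:

```python
import os
import site
import sysconfig


def candidate_script_dirs():
    """Directories where pip commonly places console scripts such as fbcap."""
    dirs = [sysconfig.get_path("scripts")]
    # `pip install --user` puts scripts under the user base
    # (for example ~/.local/bin on Linux).
    dirs.append(os.path.join(site.getuserbase(), "bin"))
    return dirs


def dirs_missing_from_path(dirs, path_value):
    """Return the directories from `dirs` that do not appear in PATH."""
    on_path = set(path_value.split(os.pathsep))
    return [d for d in dirs if d not in on_path]


missing = dirs_missing_from_path(candidate_script_dirs(),
                                 os.environ.get("PATH", ""))
```

If `missing` contains a directory that holds `fbcap`, adding that directory to `PATH` should make the command available.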

AmbiguousTimeError while trying to run fbcap

There's not much to say about this issue; the error output is below. I'm in the Polish time zone.

alvinek@linux-1t6h:~/Downloads/facebook-x/html> fbcap ./messages.html
Traceback (most recent call last):
  File "/usr/bin/fbcap", line 9, in <module>
    load_entry_point('fbchat-archive-parser==0.8.post21', 'console_scripts', 'fbcap')()
  File "/usr/lib/python3.4/site-packages/pkg_resources/__init__.py", line 558, in load_entry_point
    return get_distribution(dist).load_entry_point(group, name)
  File "/usr/lib/python3.4/site-packages/pkg_resources/__init__.py", line 2682, in load_entry_point
    return ep.load()
  File "/usr/lib/python3.4/site-packages/pkg_resources/__init__.py", line 2355, in load
    return self.resolve()
  File "/usr/lib/python3.4/site-packages/pkg_resources/__init__.py", line 2361, in resolve
    module = __import__(self.module_name, fromlist=['__name__'], level=0)
  File "/usr/lib/python3.4/site-packages/fbchat_archive_parser/main.py", line 11, in <module>
    from .parser import MessageHtmlParser
  File "/usr/lib/python3.4/site-packages/fbchat_archive_parser/parser.py", line 13, in <module>
    from .time import parse_timestamp
  File "/usr/lib/python3.4/site-packages/fbchat_archive_parser/time.py", line 43, in <module>
    tz = pytz_timezone(tz_name).localize(datetime.now() + dt_timedelta(days=d), is_dst=None)
  File "/usr/lib/python3.4/site-packages/pytz/tzinfo.py", line 349, in localize
    raise AmbiguousTimeError(dt)
pytz.exceptions.AmbiguousTimeError: 2017-11-05 00:27:21.444891

Unexpected time format

Hey,

My Facebook language setting is English (US), as you suggested, and the archive was exported with that setting. I'm still getting this error, however. What do you think?

Unexpected time format in "Thursday, September 22, 2011 at 3:09pm EDT". This program only accepts English locale time formatting. 
If you downloaded your Facebook data in a different language, please temporarily switch your language settings to English (US), re-download, and try again. 
If that doesn't help, then please report this as a bug on the associated GitHub page. 

Thank you for your time!
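
The timestamp in the message matches the English (US) layout except for the trailing `EDT`, so the zone suffix looks like the part that is not being handled. A minimal sketch, assuming it is enough to split off a trailing zone token before parsing. `split_zone` is a hypothetical helper and not the project's code:

```python
import re
from datetime import datetime

US_FORMAT = "%A, %B %d, %Y at %I:%M%p"


def split_zone(raw):
    """Separate a trailing zone token such as 'EDT' or 'UTC-03' from the rest."""
    match = re.match(r"^(.*?)\s+([A-Z]{2,5}(?:[+-]\d{1,2})?)$", raw)
    if match:
        return match.group(1), match.group(2)
    return raw, None


body, zone = split_zone("Thursday, September 22, 2011 at 3:09pm EDT")
# The result is a naive datetime; the zone token is kept separately because
# abbreviations like EDT do not map to a single unambiguous offset in general.
parsed = datetime.strptime(body, US_FORMAT)
```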

Silent failure when parsing

This worked previously but no longer does. I think Facebook introduced a new archive format in the last few weeks, since it worked fine about a month ago. Running
fbcap ./html/messages.htm
yields:

----------------------------------------er]...
 Conversation history of Will Strimling
----------------------------------------

   There's nothing here!

Blocking - ValueError: time data does not match format

This tool looks really useful, but right now it does not work at all for me - see below. Is this possible to fix? Thank you for your effort!

...Traceback (most recent call last):
  File "/usr/local/bin/fbcap", line 11, in <module>
    sys.exit(main())
  File "/usr/local/lib/python3.5/dist-packages/fbchat_archive_parser/main.py", line 137, in main
    app.run()
  File "/usr/local/lib/python3.5/dist-packages/clip.py", line 652, in run
    self.invoke(self.parse(tokens))
  File "/usr/local/lib/python3.5/dist-packages/clip.py", line 634, in invoke
    self._main.invoke(parsed)
  File "/usr/local/lib/python3.5/dist-packages/clip.py", line 519, in invoke
    self._callback(**{k: v for k, v in iteritems(parsed) if k not in self._subcommands})
  File "/usr/local/lib/python3.5/dist-packages/fbchat_archive_parser/main.py", line 87, in fbcap
    raise e
  File "/usr/local/lib/python3.5/dist-packages/fbchat_archive_parser/main.py", line 70, in fbcap
    fbch = parse_data(parser_call)
  File "/usr/local/lib/python3.5/dist-packages/fbchat_archive_parser/main.py", line 92, in parse_data
    return parser_call()
  File "/usr/local/lib/python3.5/dist-packages/fbchat_archive_parser/parser.py", line 124, in __init__
    self._parse_content(bs4)
  File "/usr/local/lib/python3.5/dist-packages/fbchat_archive_parser/parser.py", line 146, in _parse_content
    self._process_element(pos, element)
  File "/usr/local/lib/python3.5/dist-packages/fbchat_archive_parser/parser.py", line 325, in _process_element
    self._parse_time(e)
  File "/usr/local/lib/python3.5/dist-packages/fbchat_archive_parser/parser.py", line 242, in _parse_time
    timestamp = datetime.strptime(timestamp, self._DATE_FORMAT)
  File "/usr/lib/python3.5/_strptime.py", line 510, in _strptime_datetime
    tt, fraction = _strptime(data_string, format)
  File "/usr/lib/python3.5/_strptime.py", line 343, in _strptime
    (data_string, format))
ValueError: time data '19. september 2014 kl. 16:00' does not match format '%A, %B %d, %Y at %I:%M%p'

Error at pip install

Hey guys,

When trying to pip install this from cloud9 IDE I get the following errors:

$ pip install fbchat-archive-parser
Collecting fbchat-archive-parser
Downloading fbchat_archive_parser-1.2.tar.gz
Collecting arrow==0.9.0 (from fbchat-archive-parser)
Downloading arrow-0.9.0.tar.gz (86kB)
100% |████████████████████████████████| 92kB 3.4MB/s
Collecting babel==2.4.0 (from fbchat-archive-parser)
Downloading Babel-2.4.0-py2.py3-none-any.whl (6.8MB)
100% |████████████████████████████████| 6.8MB 213kB/s
Collecting beautifulsoup4==4.5.3 (from fbchat-archive-parser)
Downloading beautifulsoup4-4.5.3-py3-none-any.whl (85kB)
100% |████████████████████████████████| 92kB 9.0MB/s
Requirement already satisfied: bs4==0.0.1 in /opt/pyenv/versions/3.6.0/lib/python3.6/site-packages (from fbchat-archive-parser)
Requirement already satisfied: click==6.7 in /opt/pyenv/versions/3.6.0/lib/python3.6/site-packages (from fbchat-archive-parser)
Collecting colorama==0.3.7 (from fbchat-archive-parser)
Downloading colorama-0.3.7-py2.py3-none-any.whl
Collecting python-dateutil==2.6.0 (from fbchat-archive-parser)
Downloading python_dateutil-2.6.0-py2.py3-none-any.whl (194kB)
100% |████████████████████████████████| 194kB 4.5MB/s
Collecting pytz==2016.7 (from fbchat-archive-parser)
Downloading pytz-2016.7-py2.py3-none-any.whl (480kB)
100% |████████████████████████████████| 481kB 2.3MB/s
Collecting pyyaml==3.12 (from fbchat-archive-parser)
Downloading PyYAML-3.12.tar.gz (253kB)
100% |████████████████████████████████| 256kB 4.0MB/s
Collecting requests==2.13.0 (from fbchat-archive-parser)
Downloading requests-2.13.0-py2.py3-none-any.whl (584kB)
100% |████████████████████████████████| 593kB 2.1MB/s
Requirement already satisfied: six==1.10.0 in /opt/pyenv/versions/3.6.0/lib/python3.6/site-packages (from fbchat-archive-parser)
Installing collected packages: python-dateutil, arrow, pytz, babel, beautifulsoup4, colorama, pyyaml, requests, fbchat-archive-parser
Running setup.py install for arrow ... done
Found existing installation: pytz 2017.2
Uninstalling pytz-2017.2:
Successfully uninstalled pytz-2017.2
Found existing installation: beautifulsoup4 4.6.0
Uninstalling beautifulsoup4-4.6.0:
Successfully uninstalled beautifulsoup4-4.6.0
Running setup.py install for pyyaml ... error
Complete output from command /opt/pyenv/versions/3.6.0/bin/python3.6 -u -c "import setuptools, tokenize;__file__='/tmp/pip-build-kgluw1v1/pyyaml/setup.py';f=getattr(tokenize, 'open', open)(__file__);code=f.read().replace('\r\n', '\n');f.close();exec(compile(code, __file__, 'exec'))" install --record /tmp/pip-xrauev9m-record/install-record.txt --single-version-externally-managed --compile:
running install
running build
running build_py
creating build
creating build/lib.linux-x86_64-3.6
creating build/lib.linux-x86_64-3.6/yaml
copying lib3/yaml/emitter.py -> build/lib.linux-x86_64-3.6/yaml
copying lib3/yaml/tokens.py -> build/lib.linux-x86_64-3.6/yaml
copying lib3/yaml/loader.py -> build/lib.linux-x86_64-3.6/yaml
copying lib3/yaml/dumper.py -> build/lib.linux-x86_64-3.6/yaml
copying lib3/yaml/cyaml.py -> build/lib.linux-x86_64-3.6/yaml
copying lib3/yaml/serializer.py -> build/lib.linux-x86_64-3.6/yaml
copying lib3/yaml/error.py -> build/lib.linux-x86_64-3.6/yaml
copying lib3/yaml/composer.py -> build/lib.linux-x86_64-3.6/yaml
copying lib3/yaml/parser.py -> build/lib.linux-x86_64-3.6/yaml
copying lib3/yaml/scanner.py -> build/lib.linux-x86_64-3.6/yaml
copying lib3/yaml/representer.py -> build/lib.linux-x86_64-3.6/yaml
copying lib3/yaml/__init__.py -> build/lib.linux-x86_64-3.6/yaml
copying lib3/yaml/events.py -> build/lib.linux-x86_64-3.6/yaml
copying lib3/yaml/reader.py -> build/lib.linux-x86_64-3.6/yaml
copying lib3/yaml/nodes.py -> build/lib.linux-x86_64-3.6/yaml
copying lib3/yaml/constructor.py -> build/lib.linux-x86_64-3.6/yaml
copying lib3/yaml/resolver.py -> build/lib.linux-x86_64-3.6/yaml
warning: build_py: byte-compiling is disabled, skipping.

running build_ext
creating build/temp.linux-x86_64-3.6
checking if libyaml is compilable
clang -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fsanitize=signed-integer-overflow -fsanitize=undefined -ggdb3 -O0 -std=c11 -Wall -Werror -Wextra -Wno-sign-compare -Wshadow -fPIC -I/opt/pyenv/versions/3.6.0/include/python3.6m -c build/temp.linux-x86_64-3.6/check_libyaml.c -o build/temp.linux-x86_64-3.6/check_libyaml.o
checking if libyaml is linkable
clang build/temp.linux-x86_64-3.6/check_libyaml.o -lyaml -o build/temp.linux-x86_64-3.6/check_libyaml
building '_yaml' extension
creating build/temp.linux-x86_64-3.6/ext
clang -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fsanitize=signed-integer-overflow -fsanitize=undefined -ggdb3 -O0 -std=c11 -Wall -Werror -Wextra -Wno-sign-compare -Wshadow -fPIC -I/opt/pyenv/versions/3.6.0/include/python3.6m -c ext/_yaml.c -o build/temp.linux-x86_64-3.6/ext/_yaml.o
In file included from ext/_yaml.c:271:
ext/_yaml.h:10:9: error: 'PyString_CheckExact' macro redefined [-Werror,-Wmacro-redefined]
#define PyString_CheckExact PyBytes_CheckExact
        ^
ext/_yaml.c:139:11: note: previous definition is here
  #define PyString_CheckExact          PyUnicode_CheckExact
          ^
ext/_yaml.c:1410:17: error: assigning to 'char *' from 'const char *' discards qualifiers [-Werror,-Wincompatible-pointer-types-discards-qualifiers]
  __pyx_v_value = yaml_get_version_string();
                ^ ~~~~~~~~~~~~~~~~~~~~~~~~~
ext/_yaml.c:2577:52: error: incompatible pointer types passing 'int (void *, char *, size_t, size_t *)' (aka 'int (void *, char *, unsigned long, unsigned long *)') to parameter of type 'yaml_read_handler_t *' (aka 'int (*)(void *, unsigned char *, unsigned long, unsigned long *)') [-Werror,-Wincompatible-pointer-types]
    yaml_parser_set_input((&__pyx_v_self->parser), __pyx_f_5_yaml_input_handler, ((void *)__pyx_v_self));
                                                   ^~~~~~~~~~~~~~~~~~~~~~~~~~~~
/usr/include/yaml.h:1368:30: note: passing argument to parameter 'handler' here
        yaml_read_handler_t *handler, void *data);
                             ^
ext/_yaml.c:2818:59: error: passing 'char *' to parameter of type 'const unsigned char *' converts between pointers to integer types with different sign [-Werror,-Wpointer-sign]
    yaml_parser_set_input_string((&__pyx_v_self->parser), PyString_AS_STRING(__pyx_v_stream), PyString_GET_SIZE(__pyx_v_stream));
                                                          ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
ext/_yaml.h:11:29: note: expanded from macro 'PyString_AS_STRING'
#define PyString_AS_STRING  PyBytes_AS_STRING
                            ^
/opt/pyenv/versions/3.6.0/include/python3.6m/bytesobject.h:85:31: note: expanded from macro 'PyBytes_AS_STRING'
#define PyBytes_AS_STRING(op) (assert(PyBytes_Check(op)), \
                              ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/usr/include/yaml.h:1342:30: note: passing argument to parameter 'input' here
        const unsigned char *input, size_t size);
                             ^
ext/_yaml.c:4572:38: error: passing 'yaml_char_t *' (aka 'unsigned char *') to parameter of type 'const char *' converts between pointers to integer types with different sign [-Werror,-Wpointer-sign]
    __pyx_t_2 = PyUnicode_FromString(__pyx_v_token->data.tag_directive.handle); if (unlikely(!__pyx_t_2)) __PYX_ERR(0, 417, __pyx_L1_error)
                                     ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/opt/pyenv/versions/3.6.0/include/python3.6m/unicodeobject.h:703:17: note: passing argument to parameter 'u' here
    const char *u              /* UTF-8 encoded string */
                ^
ext/_yaml.c:4584:38: error: passing 'yaml_char_t *' (aka 'unsigned char *') to parameter of type 'const char *' converts between pointers to integer types with different sign [-Werror,-Wpointer-sign]
    __pyx_t_2 = PyUnicode_FromString(__pyx_v_token->data.tag_directive.prefix); if (unlikely(!__pyx_t_2)) __PYX_ERR(0, 418, __pyx_L1_error)
                                     ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/opt/pyenv/versions/3.6.0/include/python3.6m/unicodeobject.h:703:17: note: passing argument to parameter 'u' here
    const char *u              /* UTF-8 encoded string */
                ^
ext/_yaml.c:5444:38: error: passing 'yaml_char_t *' (aka 'unsigned char *') to parameter of type 'const char *' converts between pointers to integer types with different sign [-Werror,-Wpointer-sign]
    __pyx_t_2 = PyUnicode_FromString(__pyx_v_token->data.alias.value); if (unlikely(!__pyx_t_2)) __PYX_ERR(0, 448, __pyx_L1_error)
                                     ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/opt/pyenv/versions/3.6.0/include/python3.6m/unicodeobject.h:703:17: note: passing argument to parameter 'u' here
    const char *u              /* UTF-8 encoded string */
                ^
ext/_yaml.c:5518:38: error: passing 'yaml_char_t *' (aka 'unsigned char *') to parameter of type 'const char *' converts between pointers to integer types with different sign [-Werror,-Wpointer-sign]
    __pyx_t_2 = PyUnicode_FromString(__pyx_v_token->data.anchor.value); if (unlikely(!__pyx_t_2)) __PYX_ERR(0, 451, __pyx_L1_error)
                                     ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/opt/pyenv/versions/3.6.0/include/python3.6m/unicodeobject.h:703:17: note: passing argument to parameter 'u' here
    const char *u              /* UTF-8 encoded string */
                ^
ext/_yaml.c:5592:38: error: passing 'yaml_char_t *' (aka 'unsigned char *') to parameter of type 'const char *' converts between pointers to integer types with different sign [-Werror,-Wpointer-sign]
    __pyx_t_2 = PyUnicode_FromString(__pyx_v_token->data.tag.handle); if (unlikely(!__pyx_t_2)) __PYX_ERR(0, 454, __pyx_L1_error)
                                     ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/opt/pyenv/versions/3.6.0/include/python3.6m/unicodeobject.h:703:17: note: passing argument to parameter 'u' here
    const char *u              /* UTF-8 encoded string */
                ^
ext/_yaml.c:5604:38: error: passing 'yaml_char_t *' (aka 'unsigned char *') to parameter of type 'const char *' converts between pointers to integer types with different sign [-Werror,-Wpointer-sign]
    __pyx_t_2 = PyUnicode_FromString(__pyx_v_token->data.tag.suffix); if (unlikely(!__pyx_t_2)) __PYX_ERR(0, 455, __pyx_L1_error)
                                     ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/opt/pyenv/versions/3.6.0/include/python3.6m/unicodeobject.h:703:17: note: passing argument to parameter 'u' here
    const char *u              /* UTF-8 encoded string */
                ^
ext/_yaml.c:5716:38: error: passing 'yaml_char_t *' (aka 'unsigned char *') to parameter of type 'const char *' converts between pointers to integer types with different sign [-Werror,-Wpointer-sign]
    __pyx_t_2 = PyUnicode_DecodeUTF8(__pyx_v_token->data.scalar.value, __pyx_v_token->data.scalar.length, ((char *)"strict")); if (unlikely(!__pyx_t_2)) __PYX_ERR(0, 460, __pyx_L1_error)
                                     ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/opt/pyenv/versions/3.6.0/include/python3.6m/unicodeobject.h:1287:17: note: passing argument to parameter 'string' here
    const char *string,         /* UTF-8 encoded string */
                ^
ext/_yaml.c:7424:42: error: passing 'yaml_char_t *' (aka 'unsigned char *') to parameter of type 'const char *' converts between pointers to integer types with different sign [-Werror,-Wpointer-sign]
        __pyx_t_4 = PyUnicode_FromString(__pyx_v_tag_directive->handle); if (unlikely(!__pyx_t_4)) __PYX_ERR(0, 574, __pyx_L1_error)
                                         ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/opt/pyenv/versions/3.6.0/include/python3.6m/unicodeobject.h:703:17: note: passing argument to parameter 'u' here
    const char *u              /* UTF-8 encoded string */
                ^
ext/_yaml.c:7436:42: error: passing 'yaml_char_t *' (aka 'unsigned char *') to parameter of type 'const char *' converts between pointers to integer types with different sign [-Werror,-Wpointer-sign]
        __pyx_t_4 = PyUnicode_FromString(__pyx_v_tag_directive->prefix); if (unlikely(!__pyx_t_4)) __PYX_ERR(0, 575, __pyx_L1_error)
                                         ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/opt/pyenv/versions/3.6.0/include/python3.6m/unicodeobject.h:703:17: note: passing argument to parameter 'u' here
    const char *u              /* UTF-8 encoded string */
                ^
ext/_yaml.c:7655:38: error: passing 'yaml_char_t *' (aka 'unsigned char *') to parameter of type 'const char *' converts between pointers to integer types with different sign [-Werror,-Wpointer-sign]
    __pyx_t_4 = PyUnicode_FromString(__pyx_v_event->data.alias.anchor); if (unlikely(!__pyx_t_4)) __PYX_ERR(0, 586, __pyx_L1_error)
                                     ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/opt/pyenv/versions/3.6.0/include/python3.6m/unicodeobject.h:703:17: note: passing argument to parameter 'u' here
    const char *u              /* UTF-8 encoded string */
                ^
ext/_yaml.c:7749:40: error: passing 'yaml_char_t *' (aka 'unsigned char *') to parameter of type 'const char *' converts between pointers to integer types with different sign [-Werror,-Wpointer-sign]
      __pyx_t_4 = PyUnicode_FromString(__pyx_v_event->data.scalar.anchor); if (unlikely(!__pyx_t_4)) __PYX_ERR(0, 591, __pyx_L1_error)
                                       ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/opt/pyenv/versions/3.6.0/include/python3.6m/unicodeobject.h:703:17: note: passing argument to parameter 'u' here
    const char *u              /* UTF-8 encoded string */
                ^
ext/_yaml.c:7790:40: error: passing 'yaml_char_t *' (aka 'unsigned char *') to parameter of type 'const char *' converts between pointers to integer types with different sign [-Werror,-Wpointer-sign]
      __pyx_t_4 = PyUnicode_FromString(__pyx_v_event->data.scalar.tag); if (unlikely(!__pyx_t_4)) __PYX_ERR(0, 594, __pyx_L1_error)
                                       ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/opt/pyenv/versions/3.6.0/include/python3.6m/unicodeobject.h:703:17: note: passing argument to parameter 'u' here
    const char *u              /* UTF-8 encoded string */
                ^
ext/_yaml.c:7811:38: error: passing 'yaml_char_t *' (aka 'unsigned char *') to parameter of type 'const char *' converts between pointers to integer types with different sign [-Werror,-Wpointer-sign]
    __pyx_t_4 = PyUnicode_DecodeUTF8(__pyx_v_event->data.scalar.value, __pyx_v_event->data.scalar.length, ((char *)"strict")); if (unlikely(!__pyx_t_4)) __PYX_ERR(0, 595, __pyx_L1_error)
                                     ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/opt/pyenv/versions/3.6.0/include/python3.6m/unicodeobject.h:1287:17: note: passing argument to parameter 'string' here
    const char *string,         /* UTF-8 encoded string */
                ^
ext/_yaml.c:8179:40: error: passing 'yaml_char_t *' (aka 'unsigned char *') to parameter of type 'const char *' converts between pointers to integer types with different sign [-Werror,-Wpointer-sign]
      __pyx_t_4 = PyUnicode_FromString(__pyx_v_event->data.sequence_start.anchor); if (unlikely(!__pyx_t_4)) __PYX_ERR(0, 620, __pyx_L1_error)
                                       ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/opt/pyenv/versions/3.6.0/include/python3.6m/unicodeobject.h:703:17: note: passing argument to parameter 'u' here
    const char *u              /* UTF-8 encoded string */
                ^
ext/_yaml.c:8220:40: error: passing 'yaml_char_t *' (aka 'unsigned char *') to parameter of type 'const char *' converts between pointers to integer types with different sign [-Werror,-Wpointer-sign]
      __pyx_t_4 = PyUnicode_FromString(__pyx_v_event->data.sequence_start.tag); if (unlikely(!__pyx_t_4)) __PYX_ERR(0, 623, __pyx_L1_error)
                                       ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/opt/pyenv/versions/3.6.0/include/python3.6m/unicodeobject.h:703:17: note: passing argument to parameter 'u' here
    const char *u              /* UTF-8 encoded string */
                ^
fatal error: too many errors emitted, stopping now [-ferror-limit=]
20 errors generated.
error: command 'clang' failed with exit status 1

----------------------------------------

Command "/opt/pyenv/versions/3.6.0/bin/python3.6 -u -c "import setuptools, tokenize;__file__='/tmp/pip-build-kgluw1v1/pyyaml/setup.py';f=getattr(tokenize, 'open', open)(__file__);code=f.read().replace('\r\n', '\n');f.close();exec(compile(code, __file__, 'exec'))" install --record /tmp/pip-xrauev9m-record/install-record.txt --single-version-externally-managed --compile" failed with error code 1 in /tmp/pip-build-kgluw1v1/pyyaml/

PT-BR Time Format

Sorry to open such a broad issue, but I wanted to know how I could add the time format for my language, Portuguese-BR.

I understand that I have to modify time.py, but after that, how can I make sure the script is pulling the info from that file?

Also, I couldn't fully understand the format. In time.py we have:

FACEBOOK_TIMESTAMP_FORMATS = [
    ("en_us", "dddd, MMMM D, YYYY [at] h:mmA"),                 # English US (12-hour)
    ("en_us", "dddd, MMMM D, YYYY [at] HH:mm"),                 # English US (24-hour)
    ("en_us", "dddd, D MMMM YYYY [at] HH:mm"),                  # English UK (24-hour)
    ("fr_fr", "dddd D MMMM YYYY, HH:mm"),                       # French (France)
    ("de_de", "dddd, D. MMMM YYYY [um] HH:mm"),                 # German (Germany)
    ("nb_no", "D. MMMM YYYY kl. HH:mm"),                        # Norwegian (Bokmål)
    ("es_es", "dddd, D [de] MMMM [de] YYYY [a las?] H:mm"),     # Spanish (General)
    ("hu_hu", "YYYY. MMMM D., H:mm"),                           # Hungarian
    ("it_it", "dddd D MMMM YYYY [alle ore] H:mm"),              # Italian (Italy)
    
]

and from the test html we have:
<span class="meta">Friday, October 4, 2013 at 10:05pm PDT</span>

I assume that would be English US (24-hour), right?

In Portuguese the date looks like this:
<span class="meta">Quinta, 9 de junho de 2016 às 19:53 UTC-03</span>

So... would it be this?
("pt_br", "dddd, D [de] MMMM [de] YYYY [às] HH:mm"), # Portuguese Prototype

How do I deal with the UTC-03, and how do I add Portuguese-BR? Thanks!
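One way to think about the trailing "UTC-03" is to strip it off before the date itself is matched against a format. Here is a minimal sketch, assuming the suffix always has the shape "UTC±HH" or "UTC±HH:MM"; `split_utc_offset` is a hypothetical helper written for illustration, not a function from fbcap.

```python
import re
from datetime import timedelta

# Matches a trailing "UTC-03", "UTC+01" or "UTC+05:30" at the end of a timestamp.
_UTC_SUFFIX = re.compile(r"\s+UTC([+-])(\d{1,2})(?::(\d{2}))?$")


def split_utc_offset(raw):
    """Return (timestamp_without_zone, offset); offset is None if no suffix is found.

    Hypothetical helper for illustration only.
    """
    match = _UTC_SUFFIX.search(raw)
    if match is None:
        return raw, None
    sign = -1 if match.group(1) == "-" else 1
    hours = int(match.group(2))
    minutes = int(match.group(3) or 0)
    return raw[:match.start()], sign * timedelta(hours=hours, minutes=minutes)


text, offset = split_utc_offset("Quinta, 9 de junho de 2016 às 19:53 UTC-03")
print(text)
print(offset.total_seconds())
```

The remaining text can then be handed to whatever format matching the tool performs, and the offset applied afterwards.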

Stream error

I'm now getting the following error on both Python 2 and 3 on the master branch:

Traceback (most recent call last):                                                            
  File "/usr/local/bin/fbcap", line 11, in <module>
    load_entry_point('fbchat-archive-parser', 'console_scripts', 'fbcap')()
  File "/home/sean/git/fbchat-archive-parser/fbchat_archive_parser/main.py", line 156, in main
    app.run()
  File "/usr/local/lib/python3.5/dist-packages/clip.py", line 652, in run
    self.invoke(self.parse(tokens))
  File "/usr/local/lib/python3.5/dist-packages/clip.py", line 634, in invoke
    self._main.invoke(parsed)
  File "/usr/local/lib/python3.5/dist-packages/clip.py", line 519, in invoke
    self._callback(**{k: v for k, v in iteritems(parsed) if k not in self._subcommands})
  File "/home/sean/git/fbchat-archive-parser/fbchat_archive_parser/main.py", line 100, in fbcap
    write(format, fbch)
TypeError: write() missing 1 required positional argument: 'stream'

Don't know if you want to make the stream argument optional or what exactly is going wrong here.

AttributeError

Hi, when I try to run the file I get the following error:
Traceback (most recent call last):
File "/usr/local/bin/fbcap", line 9, in <module>
load_entry_point('fbchat-archive-parser==0.6.post1', 'console_scripts', 'fbcap')()
File "/System/Library/Frameworks/Python.framework/Versions/2.7/Extras/lib/python/pkg_resources.py", line 357, in load_entry_point
return get_distribution(dist).load_entry_point(group, name)
File "/System/Library/Frameworks/Python.framework/Versions/2.7/Extras/lib/python/pkg_resources.py", line 2394, in load_entry_point
return ep.load()
File "/System/Library/Frameworks/Python.framework/Versions/2.7/Extras/lib/python/pkg_resources.py", line 2108, in load
entry = __import__(self.module_name, globals(),globals(), ['__name__'])
File "build/bdist.macosx-10.11-intel/egg/fbchat_archive_parser/main.py", line 15, in <module>
AttributeError: 'module' object has no attribute 'App'

Would anyone be able to help? Thanks

Fail to install fbchat-archive-parser

Hi, newbie here.

I'm having a problem installing this. Could anyone help me and give me some guidance? I'd really appreciate your time and help. Thank you.

bot@bot-Aspire-V5-471PG:~/Desktop/Facebook-Messenger-Bot$ pip install fbchat-archive-parser
Collecting fbchat-archive-parser
Collecting requests==2.13.0 (from fbchat-archive-parser)
  Using cached requests-2.13.0-py2.py3-none-any.whl
Collecting colorama==0.3.7 (from fbchat-archive-parser)
  Using cached colorama-0.3.7-py2.py3-none-any.whl
Requirement already satisfied: six==1.10.0 in /usr/local/lib/python2.7/dist-packages/six-1.10.0-py2.7.egg (from fbchat-archive-parser)
Collecting beautifulsoup4==4.5.3 (from fbchat-archive-parser)
  Using cached beautifulsoup4-4.5.3-py2-none-any.whl
Collecting pyyaml==3.12 (from fbchat-archive-parser)
Collecting babel==2.4.0 (from fbchat-archive-parser)
  Using cached Babel-2.4.0-py2.py3-none-any.whl
Collecting click==6.7 (from fbchat-archive-parser)
  Using cached click-6.7-py2.py3-none-any.whl
Collecting arrow==0.9.0 (from fbchat-archive-parser)
Collecting python-dateutil==2.6.0 (from fbchat-archive-parser)
  Using cached python_dateutil-2.6.0-py2.py3-none-any.whl
Collecting pytz==2016.7 (from fbchat-archive-parser)
  Using cached pytz-2016.7-py2.py3-none-any.whl
Collecting bs4==0.0.1 (from fbchat-archive-parser)
Installing collected packages: requests, colorama, beautifulsoup4, pyyaml, pytz, babel, click, python-dateutil, arrow, bs4, fbchat-archive-parser
Exception:
Traceback (most recent call last):
  File "/home/bot/.local/lib/python2.7/site-packages/pip/basecommand.py", line 215, in main
    status = self.run(options, args)
  File "/home/bot/.local/lib/python2.7/site-packages/pip/commands/install.py", line 342, in run
    prefix=options.prefix_path,
  File "/home/bot/.local/lib/python2.7/site-packages/pip/req/req_set.py", line 784, in install
    **kwargs
  File "/home/bot/.local/lib/python2.7/site-packages/pip/req/req_install.py", line 851, in install
    self.move_wheel_files(self.source_dir, root=root, prefix=prefix)
  File "/home/bot/.local/lib/python2.7/site-packages/pip/req/req_install.py", line 1064, in move_wheel_files
    isolated=self.isolated,
  File "/home/bot/.local/lib/python2.7/site-packages/pip/wheel.py", line 345, in move_wheel_files
    clobber(source, lib_dir, True)
  File "/home/bot/.local/lib/python2.7/site-packages/pip/wheel.py", line 316, in clobber
    ensure_dir(destdir)
  File "/home/bot/.local/lib/python2.7/site-packages/pip/utils/__init__.py", line 83, in ensure_dir
    os.makedirs(path)
  File "/usr/lib/python2.7/os.py", line 157, in makedirs
    mkdir(name, mode)
OSError: [Errno 13] Permission denied: '/usr/local/lib/python2.7/dist-packages/requests'

Danish locale

Hi,

Thanks for making this tool!

Just attempted to use this on data downloaded in the Danish locale. I get the following error:

Discovered chat thread with [XX, YY]...
Unexpected time format in "3. december 2015 kl. 10:29 UTC+01".  If you downloaded your 
Facebook data in a language other than English,  then it's possible support may need to
be  added to this tool.

Please report this as a bug on the associated GitHub page and it will be fixed promptly

The date format ("3. december 2015 kl. 10:29 UTC+01") translates to "3rd December 2015 at 10:29 UTC+01" (as you might have guessed).
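To make the shape of that timestamp concrete, here is a minimal sketch that parses it with the standard library only. It assumes the archive always uses lowercase Danish month names and the "D. MMMM YYYY kl. HH:mm" layout seen in the error message; `parse_danish` and `DANISH_MONTHS` are hypothetical names for illustration, not part of fbcap. The time zone suffix is ignored here.

```python
import re
from datetime import datetime

# Danish month names (assumed to be what the archive emits).
DANISH_MONTHS = {
    "januar": 1, "februar": 2, "marts": 3, "april": 4,
    "maj": 5, "juni": 6, "juli": 7, "august": 8,
    "september": 9, "oktober": 10, "november": 11, "december": 12,
}

# Day, month name, year, then "kl." followed by hour and minute.
_DANISH = re.compile(r"^(\d{1,2})\. (\w+) (\d{4}) kl\. (\d{1,2}):(\d{2})")


def parse_danish(raw):
    """Parse e.g. '3. december 2015 kl. 10:29 UTC+01' into a naive datetime."""
    match = _DANISH.match(raw)
    if match is None:
        raise ValueError("unexpected time format: %r" % raw)
    day, month_name, year, hour, minute = match.groups()
    month = DANISH_MONTHS[month_name.lower()]
    return datetime(int(year), month, int(day), int(hour), int(minute))


print(parse_danish("3. december 2015 kl. 10:29 UTC+01"))
```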

Please parse dates in different languages

I just came across your useful tool. Thanks! I would like to run it not only on my latest download, but also on archived versions of Facebook zip files, so I cannot change my locale for those.

It would be great if the tool could parse dates in any locale, or in a given one.

Not working with new archive change?

As mentioned in the readme, the file messages.htm no longer holds everyone's messages in one file; instead it links to the individual conversation files in the /messages directory. The readme gives no indication of whether the program has been modified to reflect this change or whether it changes the way I should use it.

I cannot seem to get the program to work with this change.

Running the program with messages.htm gives:

fbcap ./messages.htm
/usr/lib/python2.7/dist-packages/cryptography/hazmat/backends/__init__.py:7: UserWarning: Module six was already imported from /home/castro/.local/lib/python2.7/site-packages/six.pyc, but /usr/lib/python2.7/dist-packages is being added to sys.path
import pkg_resources
/usr/lib/python2.7/dist-packages/cryptography/hazmat/backends/__init__.py:7: UserWarning: Module requests was already imported from /usr/local/lib/python2.7/dist-packages/requests/__init__.pyc, but /usr/lib/python2.7/dist-packages is being added to sys.path
import pkg_resources
Traceback (most recent call last):
File "/usr/local/bin/fbcap", line 11, in <module>
sys.exit(main())
File "/usr/local/lib/python2.7/dist-packages/fbchat_archive_parser/main.py", line 173, in main
app.run()
File "/usr/local/lib/python2.7/dist-packages/clip.py", line 652, in run
self.invoke(self.parse(tokens))
File "/usr/local/lib/python2.7/dist-packages/clip.py", line 634, in invoke
self._main.invoke(parsed)
File "/usr/local/lib/python2.7/dist-packages/clip.py", line 519, in invoke
self._callback(**{k: v for k, v in iteritems(parsed) if k not in self._subcommands})
File "/usr/local/lib/python2.7/dist-packages/fbchat_archive_parser/main.py", line 113, in fbcap
fbch = parser.parse()
File "/usr/local/lib/python2.7/dist-packages/fbchat_archive_parser/parser.py", line 102, in parse
self._parse_content()
File "/usr/local/lib/python2.7/dist-packages/fbchat_archive_parser/parser.py", line 128, in _parse_content
self._process_element(pos, element)
File "/usr/local/lib/python2.7/dist-packages/fbchat_archive_parser/parser.py", line 262, in _process_element
self.current_sender))
Exception: Data missing from message. This is a parsingerror: None, None

If I grep to look up the conversation file that interests me and then use that file as my input, I get a different error message:

castro@ezri:~/Desktop/fb/messages$ fbcap 10152874501431571.html
/usr/lib/python2.7/dist-packages/cryptography/hazmat/backends/__init__.py:7: UserWarning: Module six was already imported from /home/castro/.local/lib/python2.7/site-packages/six.pyc, but /usr/lib/python2.7/dist-packages is being added to sys.path
import pkg_resources
/usr/lib/python2.7/dist-packages/cryptography/hazmat/backends/__init__.py:7: UserWarning: Module requests was already imported from /usr/local/lib/python2.7/dist-packages/requests/__init__.pyc, but /usr/lib/python2.7/dist-packages is being added to sys.path
import pkg_resources
Skipping chat thread with unknown participants...Traceback (most recent call last):
File "/usr/local/bin/fbcap", line 11, in <module>
sys.exit(main())
File "/usr/local/lib/python2.7/dist-packages/fbchat_archive_parser/main.py", line 173, in main
app.run()
File "/usr/local/lib/python2.7/dist-packages/clip.py", line 652, in run
self.invoke(self.parse(tokens))
File "/usr/local/lib/python2.7/dist-packages/clip.py", line 634, in invoke
self._main.invoke(parsed)
File "/usr/local/lib/python2.7/dist-packages/clip.py", line 519, in invoke
self._callback(**{k: v for k, v in iteritems(parsed) if k not in self._subcommands})
File "/usr/local/lib/python2.7/dist-packages/fbchat_archive_parser/main.py", line 113, in fbcap
fbch = parser.parse()
File "/usr/local/lib/python2.7/dist-packages/fbchat_archive_parser/parser.py", line 102, in parse
self._parse_content()
File "/usr/local/lib/python2.7/dist-packages/fbchat_archive_parser/parser.py", line 127, in _parse_content
parser=parser):
File "/usr/lib/python2.7/xml/etree/ElementTree.py", line 1271, in next
raise e
xml.etree.ElementTree.ParseError: undefined entity: line 111, column 149
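The exact markup at line 111 isn't shown, so the following is only a guess at the cause: a strict XML parser such as ElementTree rejects HTML named entities like `&nbsp;` because XML predefines only five entities. A minimal sketch of one workaround, rewriting named entities as numeric character references before parsing (`html_entities_to_numeric` is a hypothetical helper, not part of fbcap):

```python
import re
import xml.etree.ElementTree as ET
from html.entities import name2codepoint

# Entities that XML itself already understands; these are left untouched.
XML_PREDEFINED = {"amp", "lt", "gt", "quot", "apos"}


def html_entities_to_numeric(markup):
    """Rewrite HTML named entities (e.g. &nbsp;) as numeric references (&#160;)."""
    def replace(match):
        name = match.group(1)
        if name in XML_PREDEFINED or name not in name2codepoint:
            return match.group(0)  # leave XML entities and unknown names alone
        return "&#{};".format(name2codepoint[name])

    return re.sub(r"&([A-Za-z][A-Za-z0-9]*);", replace, markup)


snippet = "<p>caf&eacute;&nbsp;&amp; bar</p>"

try:
    ET.fromstring(snippet)
    strict_error = None
except ET.ParseError as error:
    strict_error = error  # the strict parser rejects the HTML-only entities

element = ET.fromstring(html_entities_to_numeric(snippet))
print(repr(element.text))
```

Falling back to a lenient HTML parser is another option; this sketch only illustrates why the strict parser gives up.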

AttributeError: FacebookChatHistory instance has no attribute 'chat_threads'

Printing the chat as text works, but running stats fails:

Traceback (most recent call last):
  File "/home/jojo/.bin/fbcap", line 11, in <module>
    sys.exit(main())
  File "/home/jojo/.local/lib/python2.7/site-packages/fbchat_archive_parser/main.py", line 154, in main
    app.run()
  File "/home/jojo/.local/lib/python2.7/site-packages/clip.py", line 652, in run
    self.invoke(self.parse(tokens))
  File "/home/jojo/.local/lib/python2.7/site-packages/clip.py", line 634, in invoke
    self._main.invoke(parsed)
  File "/home/jojo/.local/lib/python2.7/site-packages/clip.py", line 519, in invoke
    self._callback(**{k: v for k, v in iteritems(parsed) if k not in self._subcommands})
  File "/home/jojo/.local/lib/python2.7/site-packages/fbchat_archive_parser/main.py", line 96, in fbcap
    generate_stats(fbch, sys.stdout)
  File "/home/jojo/.local/lib/python2.7/site-packages/fbchat_archive_parser/main.py", line 134, in generate_stats
    threads = tuple(fbch.chat_threads[k] for k in fbch.chat_threads.keys())
AttributeError: FacebookChatHistory instance has no attribute 'chat_threads'

Datetime error

On the master branch of this repository, under both Python 2 and Python 3, I'm getting the following error:

  File "/usr/local/bin/fbcap", line 11, in <module>
    load_entry_point('fbchat-archive-parser==0.8.post4', 'console_scripts', 'fbcap')()
  File "/usr/local/lib/python3.5/dist-packages/fbchat_archive_parser/main.py", line 153, in main
    app.run()
  File "/usr/local/lib/python3.5/dist-packages/clip.py", line 652, in run
    self.invoke(self.parse(tokens))
  File "/usr/local/lib/python3.5/dist-packages/clip.py", line 634, in invoke
    self._main.invoke(parsed)
  File "/usr/local/lib/python3.5/dist-packages/clip.py", line 519, in invoke
    self._callback(**{k: v for k, v in iteritems(parsed) if k not in self._subcommands})
  File "/usr/local/lib/python3.5/dist-packages/fbchat_archive_parser/main.py", line 94, in fbcap
    fbch = parser.parse()
  File "/usr/local/lib/python3.5/dist-packages/fbchat_archive_parser/parser.py", line 88, in parse
    self._parse_content()
  File "/usr/local/lib/python3.5/dist-packages/fbchat_archive_parser/parser.py", line 114, in _parse_content
    self._process_element(pos, element)
  File "/usr/local/lib/python3.5/dist-packages/fbchat_archive_parser/parser.py", line 237, in _process_element
    parse_timestamp(e.text, self.use_utc, self.timezone_hints)
  File "/usr/local/lib/python3.5/dist-packages/fbchat_archive_parser/time.py", line 165, in parse_timestamp
    timestamp = timestamp.datetime
AttributeError: 'datetime.datetime' object has no attribute 'datetime'
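The last frame suggests that `parse_timestamp` sometimes ends up with a plain `datetime` where it expected a wrapper object exposing a `.datetime` attribute. A minimal defensive sketch, assuming both kinds of value can reach that line; `as_datetime` and `FakeArrow` are hypothetical names used only for illustration:

```python
from datetime import datetime


def as_datetime(value):
    """Return a datetime whether value already is one or wraps one in .datetime."""
    if isinstance(value, datetime):
        return value
    return value.datetime


class FakeArrow(object):
    """Stand-in for an arrow-like object that exposes a .datetime attribute."""

    def __init__(self, moment):
        self.datetime = moment


moment = datetime(2016, 11, 5, 2, 39)
print(as_datetime(moment) is moment)
print(as_datetime(FakeArrow(moment)) is moment)
```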

Parsing facebook IDs

I recently downloaded and went through my data. It seems like at some point in November 2016, the "Sender" field stops displaying a readable name and instead displays "[email protected]". I'm sure it's possible to map these values back to their real names based on the threads?

UPDATE: Oops, never mind. I just found the --resolve flag, which wasn't present in the old version I had installed.

Issue with CharMap

Hi again, I am trying to use a different messages.htm file, but it throws this error:

fbcap ./messages.htm -f stats > stats_div.txt
Discovered chat thread with [*REDACTED*, *REDACTED*l]...
The streaming parser crashed due to malformed XML. Falling back to the less strict/efficient python html.parser. It may
take a while before you see output...
Traceback (most recent call last):
  File "c:\users\logan\appdata\local\programs\python\python35-32\lib\site-packages\fbchat_archive_parser\main.py", line
76, in fbcap
    fbch = parse_data(parser_call)
  File "c:\users\logan\appdata\local\programs\python\python35-32\lib\site-packages\fbchat_archive_parser\main.py", line
98, in parse_data
    return parser_call()
  File "c:\users\logan\appdata\local\programs\python\python35-32\lib\site-packages\fbchat_archive_parser\parser.py", lin
e 125, in __init__
    self._parse_content(bs4)
  File "c:\users\logan\appdata\local\programs\python\python35-32\lib\site-packages\fbchat_archive_parser\parser.py", lin
e 146, in _parse_content
    parser=XMLParser(encoding=str('UTF-8'))):
  File "c:\users\logan\appdata\local\programs\python\python35-32\lib\xml\etree\ElementTree.py", line 1297, in __next__
    for event in self._parser.read_events():
  File "c:\users\logan\appdata\local\programs\python\python35-32\lib\xml\etree\ElementTree.py", line 1279, in read_event
s
    raise event
  File "c:\users\logan\appdata\local\programs\python\python35-32\lib\xml\etree\ElementTree.py", line 1237, in feed
    self._parser.feed(data)
xml.etree.ElementTree.ParseError: not well-formed (invalid token): line 1752, column 1313

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "C:\Users\Logan\AppData\Local\Programs\Python\Python35-32\Scripts\fbcap-script.py", line 9, in <module>
    load_entry_point('fbchat-archive-parser==0.7.post7', 'console_scripts', 'fbcap')()
  File "c:\users\logan\appdata\local\programs\python\python35-32\lib\site-packages\fbchat_archive_parser\main.py", line
148, in main
    app.run()
  File "c:\users\logan\appdata\local\programs\python\python35-32\lib\site-packages\clip.py", line 652, in run
    self.invoke(self.parse(tokens))
  File "c:\users\logan\appdata\local\programs\python\python35-32\lib\site-packages\clip.py", line 634, in invoke
    self._main.invoke(parsed)
  File "c:\users\logan\appdata\local\programs\python\python35-32\lib\site-packages\clip.py", line 519, in invoke
    self._callback(**{k: v for k, v in iteritems(parsed) if k not in self._subcommands})
  File "c:\users\logan\appdata\local\programs\python\python35-32\lib\site-packages\fbchat_archive_parser\main.py", line
93, in fbcap
    raise e
  File "c:\users\logan\appdata\local\programs\python\python35-32\lib\site-packages\fbchat_archive_parser\main.py", line
82, in fbcap
    fbch = parse_data(partial(parser_call, bs4=True))
  File "c:\users\logan\appdata\local\programs\python\python35-32\lib\site-packages\fbchat_archive_parser\main.py", line
98, in parse_data
    return parser_call()
  File "c:\users\logan\appdata\local\programs\python\python35-32\lib\site-packages\fbchat_archive_parser\parser.py", lin
e 125, in __init__
    self._parse_content(bs4)
  File "c:\users\logan\appdata\local\programs\python\python35-32\lib\site-packages\fbchat_archive_parser\parser.py", lin
e 153, in _parse_content
    soup = BeautifulSoup(open(self.stream, 'r').read(), 'html.parser')
  File "c:\users\logan\appdata\local\programs\python\python35-32\lib\encodings\cp1252.py", line 23, in decode
    return codecs.charmap_decode(input,self.errors,decoding_table)[0]
UnicodeDecodeError: 'charmap' codec can't decode byte 0x8d in position 29414: character maps to <undefined>
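The final frame shows the file being opened with `open(self.stream, 'r')` and no explicit encoding, so on Windows Python falls back to the locale code page (cp1252 here), which has no character for byte 0x8d. A minimal sketch reproducing the failure and the likely fix, assuming the archive is UTF-8 encoded; the file created here is a throwaway temporary file, not a real archive:

```python
import io
import os
import tempfile

# Write a small UTF-8 file; the letter "č" (U+010D) is encoded as bytes C4 8D.
path = os.path.join(tempfile.mkdtemp(), "messages.htm")
with io.open(path, "wb") as handle:
    handle.write(u"<p>Dobr\u00fd ve\u010der</p>".encode("utf-8"))

# Reading it back as cp1252 fails because byte 0x8d is undefined in that code page.
try:
    with io.open(path, "r", encoding="cp1252") as handle:
        handle.read()
    cp1252_failed = False
except UnicodeDecodeError as error:
    cp1252_failed = True
    print("cp1252 failed:", error)

# Passing the encoding explicitly decodes the file correctly.
with io.open(path, "r", encoding="utf-8") as handle:
    text = handle.read()
print(text)
```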

Not compatible with Hungarian language and/or date format

Downloading in Hungarian gives the following error.
Unexpected time format in "2016. j├║nius 16., 9:44 UTC+02". If you downloaded your Facebook data in a language other than English, then it's possible support may need to be added to this tool.

When I download Hungarian messages with the language set to English, all of our accented letters (á é ö ő ó ú ü ű) are displayed incorrectly.
Thanks for the package though, it's awesome!

Error when parsing messages

Traceback (most recent call last):                                               
  File "/Users/miroslav/anaconda3/bin/fbcap", line 11, in <module>
    sys.exit(fbcap())
  File "/Users/miroslav/anaconda3/lib/python3.6/site-packages/click/core.py", line 722, in __call__
    return self.main(*args, **kwargs)
  File "/Users/miroslav/anaconda3/lib/python3.6/site-packages/click/core.py", line 697, in main
    rv = self.invoke(ctx)
  File "/Users/miroslav/anaconda3/lib/python3.6/site-packages/click/core.py", line 895, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/Users/miroslav/anaconda3/lib/python3.6/site-packages/click/core.py", line 535, in invoke
    return callback(*args, **kwargs)
  File "/Users/miroslav/anaconda3/lib/python3.6/site-packages/fbchat_archive_parser/main.py", line 127, in fbcap
    write(fmt, fbch, directory or sys.stdout)
  File "/Users/miroslav/anaconda3/lib/python3.6/site-packages/fbchat_archive_parser/writers/__init__.py", line 44, in write
    selected_writer().write(data, stream_or_dir)
  File "/Users/miroslav/anaconda3/lib/python3.6/site-packages/fbchat_archive_parser/writers/writer.py", line 17, in write
    return self.write_history(data, stream)
  File "/Users/miroslav/anaconda3/lib/python3.6/site-packages/fbchat_archive_parser/writers/text.py", line 19, in write_history
    ('-' * len(history.user)) + "-\n"
TypeError: object of type 'NoneType' has no len()

Happens every time regardless. Does anyone have any idea what's wrong?

AttributeError

Hi, when I run the command fbcap messages.htm I get the following error and the application halts. I have ensured that the messages.htm is in the correct directory.

C:\Python27\Scripts>fbcap messages.htm
Traceback (most recent call last):
  File "C:\Python27\Scripts\fbcap-script.py", line 11, in
    load_entry_point('fbchat-archive-parser==0+unknown', '
ap')()
  File "build\bdist.win-amd64\egg\fbchat_archive_parser\ma
ain
  File "c:\python27\lib\site-packages\clip.py", line 652,
    self.invoke(self.parse(tokens))
  File "c:\python27\lib\site-packages\clip.py", line 634,
    self._main.invoke(parsed)
  File "c:\python27\lib\site-packages\clip.py", line 519,
    self._callback(**{k: v for k, v in iteritems(parsed) i
mmands})
  File "build\bdist.win-amd64\egg\fbchat_archive_parser\ma
bcap
  File "build\bdist.win-amd64\egg\fbchat_archive_parser\pa
 parse
  File "build\bdist.win-amd64\egg\fbchat_archive_parser\pa
 _parse_content
  File "build\bdist.win-amd64\egg\fbchat_archive_parser\pa
 _process_element
  File "build\bdist.win-amd64\egg\fbchat_archive_parser\pa
 _parse_participants
  File "build\bdist.win-amd64\egg\fbchat_archive_parser\na
155, in resolve
  File "build\bdist.win-amd64\egg\fbchat_archive_parser\na
134, in _manual_lookup
AttributeError: 'NoneType' object has no attribute 'get'

Ambiguous timezone offset found [CDT]?

Hello, I have tried to run the tool, but I keep getting this message:

Ambiguous timezone offset found [CDT]. Please re-run the parser with the -z TZ=OFFSET[,TZ=OFFSET2[,...]] flag.(e.g. -t PST=-0800,PDT=-0700). Your options are as follows:
-> [-0400] for regions like America/Havana, Cuba
-> [-0500] for regions like America/Merida, America/Menominee, America/Matamoros

I don't live anywhere near Cuba -- I live in Missouri. How do I fix this?
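Missouri observes US Central time, where CDT is UTC-5, so going by the message above the hint would be `-z CDT=-0500` (the Cuba line is just the other region that shares the abbreviation). To illustrate the `TZ=OFFSET[,TZ=OFFSET2]` syntax, here is a minimal sketch of how such a hint string could be turned into offsets; `parse_timezone_hints` is a hypothetical helper, not fbcap's actual implementation.

```python
from datetime import timedelta


def parse_timezone_hints(spec):
    """Turn 'CST=-0600,CDT=-0500' into {'CST': timedelta, 'CDT': timedelta}."""
    hints = {}
    for pair in spec.split(","):
        name, _, offset = pair.partition("=")
        sign = -1 if offset.startswith("-") else 1
        digits = offset.lstrip("+-")  # "0500" -> hours "05", minutes "00"
        delta = timedelta(hours=int(digits[:2]), minutes=int(digits[2:]))
        hints[name.strip()] = sign * delta
    return hints


hints = parse_timezone_hints("CST=-0600,CDT=-0500")
print(hints["CDT"].total_seconds() / 3600)
```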

Streaming Parser crashes when redirecting to a file

Hi there again! Thanks for the speedy response.

This bug is a little vague. When I try running

fbcap ./messages.htm -f json > file.json

This always happens

The streaming parser crashed due to malformed XML. Falling back to the less strict/efficient python html.parser. It may take a while before you see output...

However, if I do it without the '> file.json' the streaming works normally.

Thanks again!

Unexpected time format

Hi, I'm not sure why this error suddenly started appearing. The format doesn't seem to have changed, except that a location is now appended at the end. Is there any workaround?

Unexpected time format in "Tuesday, June 2, 1970 at 3:26pm Asia/Kuala_Lumpur". If you downloaded your Facebook data in a language other than English, then it's possible support may need to be added to this tool.

Please report this as a bug on the associated GitHub page and it will be fixed promptly.

Please add Brazilian portuguese locale support

I'm not into Python, so I'm trying my best to add pt_br support for fbcap. Great tool, thanks for the effort.

Anyway, after checking that arrow already supports pt_pt and pt_br locales, I've added this line to time.py without success, in the FACEBOOK_TIMESTAMP_FORMATS list:

("pt_br", "dddd, D de MMMM de YYYY [às] HH:mm") # Portuguese BR

Unfortunately, after installing the modified package with pip install . --upgrade, I get a re.py error:

Discovered chat thread with [xxxx, yyyy]...Traceback (most recent call last):
File "C:\Python27\Scripts\fbcap-script.py", line 11, in <module>
load_entry_point('fbchat-archive-parser==0+unknown', 'console_scripts', 'fbcap')()
File "c:\python27\lib\site-packages\fbchat_archive_parser\main.py", line 188, in main
app.run()
File "c:\python27\lib\site-packages\clip.py", line 652, in run
self.invoke(self.parse(tokens))
File "c:\python27\lib\site-packages\clip.py", line 634, in invoke
self._main.invoke(parsed)
File "c:\python27\lib\site-packages\clip.py", line 519, in invoke
self._callback(**{k: v for k, v in iteritems(parsed) if k not in self._subcommands})
File "c:\python27\lib\site-packages\fbchat_archive_parser\main.py", line 125, in fbcap
fbch = parser.parse()
File "c:\python27\lib\site-packages\fbchat_archive_parser\parser.py", line 102, in parse
self._parse_content()
File "c:\python27\lib\site-packages\fbchat_archive_parser\parser.py", line 128, in _parse_content
self._process_element(pos, element)
File "c:\python27\lib\site-packages\fbchat_archive_parser\parser.py", line 257, in _process_element
parse_timestamp(e.text, self.use_utc, self.timezone_hints)
File "c:\python27\lib\site-packages\fbchat_archive_parser\time.py", line 219, in parse_timestamp
timestamp = date_parser.parse(timestamp_string)
File "c:\python27\lib\site-packages\fbchat_archive_parser\time.py", line 94, in parse
return arrow.get(translated_timestamp, self.timestamp_format).datetime
File "c:\python27\lib\site-packages\arrow\api.py", line 23, in get
return _factory.get(*args, **kwargs)
File "c:\python27\lib\site-packages\arrow\factory.py", line 198, in get
dt = parser.DateTimeParser(locale).parse(args[0], args[1])
File "c:\python27\lib\site-packages\arrow\parser.py", line 157, in parse
match = re.search(final_fmt_pattern, string, flags=re.IGNORECASE)
File "c:\python27\lib\re.py", line 146, in search
return _compile(pattern, flags).search(string)
File "c:\python27\lib\re.py", line 251, in _compile
raise error, v # invalid expression
sre_constants.error: redefinition of group name u'd' as group 3; was group 1

Could you help me? Maybe I'm not installing everything correctly? I'm not well versed in the world of Python, so I can't start debugging the error. When I try running your program directly from source the interpreter complains that the source is including scripts from other directories and that is only allowed in packaged format.
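The last line of the traceback points at a regular-expression problem rather than an installation problem: the pattern arrow built from the format string defines a named group `d` twice. One plausible reading, offered as an assumption, is that the two unbracketed `de` words were read as format tokens instead of literal text; the other entries in FACEBOOK_TIMESTAMP_FORMATS wrap literal words in square brackets (`[at]`, `[um]`, `[de]`). The sketch below only reproduces the underlying `re` behaviour with the standard library; it does not call arrow.

```python
import re

# Two groups sharing the name "d" cannot be compiled: this is the same class
# of error reported at the bottom of the traceback.
try:
    re.compile(r"(?P<d>\w+) (?P<d>\w+)")
    duplicate_error = None
except re.error as error:
    duplicate_error = error
    print(error)

# With unique group names and the literal word kept as plain text,
# the pattern compiles and matches as expected.
pattern = re.compile(r"(?P<day>\d{1,2}) de (?P<month>\w+) de (?P<year>\d{4})")
match = pattern.search("Quinta, 9 de junho de 2016")
print(match.group("day"), match.group("month"), match.group("year"))
```

If that reading is right, a format string with the literals bracketed, such as "dddd, D [de] MMMM [de] YYYY [às] HH:mm", would avoid the duplicate group.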

AttributeError: 'NoneType' object has no attribute 'strip'

I just tried out fbcap on my messages.htm file. It begins to process the file, but then it dies as follows:

$ fbcap messages.htm
Discovered chat thread with [XXX,YYY,ZZZ]...Traceback (most recent call last):
  File "/usr/local/bin/fbcap", line 9, in <module>
    load_entry_point('fbchat-archive-parser==0.4', 'console_scripts', 'fbcap')()
  File "build/bdist.macosx-10.10-intel/egg/fbchat_archive_parser/main.py", line 66, in main
  File "/Library/Python/2.7/site-packages/clip.py", line 652, in run
    self.invoke(self.parse(tokens))
  File "/Library/Python/2.7/site-packages/clip.py", line 634, in invoke
    self._main.invoke(parsed)
  File "/Library/Python/2.7/site-packages/clip.py", line 519, in invoke
    self._callback(**{k: v for k, v in iteritems(parsed) if k not in self._subcommands})
  File "build/bdist.macosx-10.10-intel/egg/fbchat_archive_parser/main.py", line 27, in fbcap
  File "build/bdist.macosx-10.10-intel/egg/fbchat_archive_parser/parser.py", line 98, in __init__
  File "build/bdist.macosx-10.10-intel/egg/fbchat_archive_parser/parser.py", line 108, in __parse_content
  File "build/bdist.macosx-10.10-intel/egg/fbchat_archive_parser/parser.py", line 174, in __process_element
AttributeError: 'NoneType' object has no attribute 'strip'

I tried python 2.7 and 3.5.1 just for fun, same results.

Timestamp parsing error

Received the following error message:
Unexpected time format in "5. november 2016 kl. 02:39 UTC+01". This program is optimized to accept English locale time formatting, but will try its best to parse other language time formatting at heavy efficiency costs. Apparently, the time stamp formatting your archive uses has somehow eluded even that.

If you downloaded your Facebook data in a different language, you can try temporarily switching your Facebook language settings to English (US), re-download, and try again. If that doesn't help, then please report this as a bug on the associated GitHub page.

The Facebook profile uses Norwegian (Bokmål).

EDIT: I forgot to mention that your tool looks really awesome and I'm looking forward to testing it!

Faulty Username Parsing...?

First off, thanks for this cool script.

Something seems to be wrong with the way you are parsing user names from the HTML. I'm not sure whether Facebook changed their HTML structure or whether this is something on my end.

I'm using the following command:

fbcap messages.htm -d parsed_msgs/

And get the following output (I removed the FB ID #'s in [email protected]):

Discovered chat thread with [[email protected], [email protected]]...Traceback (most recent call last):
  File "c:\program files (x86)\python\lib\runpy.py", line 170, in _run_module_as_main
    "__main__", mod_spec)
  File "c:\program files (x86)\python\lib\runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "C:\Program Files (x86)\Python\Scripts\fbcap.exe\__main__.py", line 9, in <module>
  File "c:\program files (x86)\python\lib\site-packages\fbchat_archive_parser\main.py", line 188, in main
    app.run()
  File "c:\program files (x86)\python\lib\site-packages\clip.py", line 652, in run
    self.invoke(self.parse(tokens))
  File "c:\program files (x86)\python\lib\site-packages\clip.py", line 634, in invoke
    self._main.invoke(parsed)
  File "c:\program files (x86)\python\lib\site-packages\clip.py", line 519, in invoke
    self._callback(**{k: v for k, v in iteritems(parsed) if k not in self._subcommands})
  File "c:\program files (x86)\python\lib\site-packages\fbchat_archive_parser\main.py", line 125, in fbcap
    fbch = parser.parse()
  File "c:\program files (x86)\python\lib\site-packages\fbchat_archive_parser\parser.py", line 102, in parse
    self._parse_content()
  File "c:\program files (x86)\python\lib\site-packages\fbchat_archive_parser\parser.py", line 128, in _parse_content
    self._process_element(pos, element)
  File "c:\program files (x86)\python\lib\site-packages\fbchat_archive_parser\parser.py", line 262, in _process_element
    self.current_sender))
Exception: Data missing from message. This is a parsingerror: 2013-06-02 17:22:00-07:00, None

So obviously it's getting "None" for the username. If I just add the line self.current_user = "Unknown" before the thrown exception the script runs and completes but it obviously just results in every user getting named "Unknown" in every conversation.

Error with newly downloaded FB archive.

I just downloaded my fb data just as the previous issue was raised, and waited until the fix for the January '18 format was released.
I've updated fbcap to 1.3.1, and get this error:

Traceback (most recent call last):
File "/usr/local/bin/fbcap", line 9, in <module>
load_entry_point('fbchat-archive-parser==1.3.1', 'console_scripts', 'fbcap')()
File "/Library/Python/2.7/site-packages/click/core.py", line 722, in __call__
return self.main(*args, **kwargs)
File "/Library/Python/2.7/site-packages/click/core.py", line 697, in main
rv = self.invoke(ctx)
File "/Library/Python/2.7/site-packages/click/core.py", line 895, in invoke
return ctx.invoke(self.callback, **ctx.params)
File "/Library/Python/2.7/site-packages/click/core.py", line 535, in invoke
return callback(*args, **kwargs)
File "/Library/Python/2.7/site-packages/fbchat_archive_parser/main.py", line 119, in fbcap
progress_output=not noprogress, use_utc=utc, name_resolver=resolve)
File "/Library/Python/2.7/site-packages/fbchat_archive_parser/parser.py", line 517, in parse
return parser(handle, *args, **kwargs).parse()
File "/Library/Python/2.7/site-packages/fbchat_archive_parser/parser.py", line 258, in parse
self.parse_impl()
File "/Library/Python/2.7/site-packages/fbchat_archive_parser/parser.py", line 504, in parse_impl
participants = self.parse_participants(unescaped)
File "/Library/Python/2.7/site-packages/fbchat_archive_parser/parser.py", line 350, in parse_participants
if not participants.text:
AttributeError: 'str' object has no attribute 'text'

time zone error

I get the following error when I run it:

fbchat_archive_parser.parser.UnexpectedTimeZoneError: Expected only PST/PDT time zones (found Friday, 28 February 2014 at 15:14 UTC+05:30). This is a bug.
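The timestamp in that message already carries an explicit UTC offset, so in principle it can be read without guessing a PST/PDT abbreviation. Below is a minimal sketch of that idea using only the standard library. The names OFFSET_RE and parse_with_offset are made up for illustration and are not part of fbcap, and the English day and month names assume the default C locale.

```python
import re
from datetime import datetime, timedelta, timezone

# Matches a trailing "UTC", "UTC+03" or "UTC+05:30" suffix (hypothetical helper).
OFFSET_RE = re.compile(r"\s+UTC(?:([+-])(\d{1,2})(?::?(\d{2}))?)?$")


def parse_with_offset(text):
    """Parse an English archive timestamp that ends in a UTC offset."""
    match = OFFSET_RE.search(text)
    if match is None:
        raise ValueError("no UTC offset suffix in %r" % text)
    sign, hours, minutes = match.groups()
    delta = timedelta(hours=int(hours or 0), minutes=int(minutes or 0))
    if sign == "-":
        delta = -delta
    # Everything before the suffix is the local wall-clock time.
    naive = datetime.strptime(text[:match.start()], "%A, %d %B %Y at %H:%M")
    return naive.replace(tzinfo=timezone(delta))


stamp = parse_with_offset("Friday, 28 February 2014 at 15:14 UTC+05:30")
print(stamp.isoformat())
print(stamp.astimezone(timezone.utc).isoformat())
```

Because the result is timezone-aware, converting it to UTC afterwards needs no lookup table of zone abbreviations.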

Facebook archive update

Facebook has changed the way it archives messages. Instead of putting all the messages in one file, it now creates a file called messages.htm that contains hyperlinks, labelled with the participants' names, pointing into a folder called "messages" that holds each conversation in its own HTML file.
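For anyone who wants to look at the new layout, here is a rough Python 3 sketch that lists the per-conversation files linked from the index. It assumes the links look like <a href="messages/123.html">Name</a>, which is a guess based on the description above and not a confirmed format. ThreadLinkCollector and list_thread_files are hypothetical names, not part of fbcap.

```python
from html.parser import HTMLParser


class ThreadLinkCollector(HTMLParser):
    """Collect (href, link text) pairs from every <a> tag that has an href."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text).strip()))
            self._href = None


def list_thread_files(index_html, folder="messages/"):
    """Return the links that point into the assumed per-thread folder."""
    collector = ThreadLinkCollector()
    collector.feed(index_html)
    return [(href, name) for href, name in collector.links
            if href.startswith(folder)]


sample = ('<p><a href="messages/1.html">Alice Example</a></p>'
          '<p><a href="messages/2.html">Bob Example, Carol Example</a></p>'
          '<a href="index.htm">Home</a>')
for href, participants in list_thread_files(sample):
    print(href, "->", participants)
```

Each returned path could then be opened and parsed as a single conversation.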

fbchat_archive_parser.parser.FacebookDataError

I'm getting an error when parsing the messages.htm file.
The related files are attached:
parse_file.zip

fbcap ./messages.htm > fbMessages.txt

Traceback (most recent call last):

File "/Users/coddict/anaconda/bin/fbcap", line 11, in <module>
  load_entry_point('fbchat-archive-parser==1.0.post1', 'console_scripts', 'fbcap')()
File "/Users/coddict/anaconda/lib/python3.6/site-packages/click/core.py", line 722, in __call__
  return self.main(*args, **kwargs)
File "/Users/coddict/anaconda/lib/python3.6/site-packages/click/core.py", line 697, in main
  rv = self.invoke(ctx)
File "/Users/coddict/anaconda/lib/python3.6/site-packages/click/core.py", line 895, in invoke
  return ctx.invoke(self.callback, **ctx.params)
File "/Users/coddict/anaconda/lib/python3.6/site-packages/click/core.py", line 535, in invoke
  return callback(*args, **kwargs)
File "/Users/coddict/anaconda/lib/python3.6/site-packages/fbchat_archive_parser/main.py", line 118, in fbcap
  fbch = parser.parse()
File "/Users/coddict/anaconda/lib/python3.6/site-packages/fbchat_archive_parser/parser.py", line 92, in parse
  self._parse_content()
File "/Users/coddict/anaconda/lib/python3.6/site-packages/fbchat_archive_parser/parser.py", line 117, in _parse_content
  self._process_element(pos, element)
File "/Users/coddict/anaconda/lib/python3.6/site-packages/fbchat_archive_parser/parser.py", line 250, in _process_element

"An unrecoverable parsing error has occurred (missing timestamp data)"
fbchat_archive_parser.parser.FacebookDataError: An unrecoverable parsing error has occurred (missing timestamp data)

Romanian locale compatibility

Can you please add Romanian locale compatibility?

Unexpected time format in "27 martie 2017 la 20:22 UTC+03". If you downloaded your Facebook data in a language other than English, then it's possible support may need to be added to this tool.

Python 2 Unicode Character Error

Got the following error in Python 2:

UnicodeEncodeError                        Traceback (most recent call last)
/usr/local/bin/fbcap in <module>()
      9     sys.argv[0] = re.sub(r'(-script\.pyw?|\.exe)?$', '', sys.argv[0])
     10     sys.exit(
---> 11         load_entry_point('fbchat-archive-parser', 'console_scripts', 'fbcap')()
     12     )

/home/sean/git/fbchat-archive-parser/fbchat_archive_parser/main.pyc in main()
    151 def main():
    152     try:
--> 153         app.run()
    154     except clip.ClipExit:
    155         pass

/usr/local/lib/python2.7/dist-packages/clip.pyc in run(self, tokens)
    650                         tokens = shlex.split(tokens)
    651                 try:
--> 652                         self.invoke(self.parse(tokens))
    653                 finally:
    654                         self.reset()  # Clean up so the app can be used again

/usr/local/lib/python2.7/dist-packages/clip.pyc in invoke(self, parsed)
    632 		'''
    633                 self._ping_main()
--> 634                 self._main.invoke(parsed)
    635 
    636         def reset(self):

/usr/local/lib/python2.7/dist-packages/clip.pyc in invoke(self, parsed)
    517         def invoke(self, parsed):
    518                 # First invoke this command's callback
--> 519                 self._callback(**{k: v for k, v in iteritems(parsed) if k not in self._subcommands})
    520                 # Invoke subcommands (realistically only one should be invoked)
    521                 for k, v in iteritems(parsed):

/home/sean/git/fbchat-archive-parser/fbchat_archive_parser/main.pyc in fbcap(path, thread, format, nocolor, timezones, utc, noprogress)
     96             generate_stats(fbch, sys.stdout)
     97         else:
---> 98             write(format, fbch)
     99 
    100     except AmbiguousTimeZoneError as atze:

/home/sean/git/fbchat-archive-parser/fbchat_archive_parser/writers/__init__.pyc in write(format, data)
     16                                           % format)
     17     item = getattr(writer_type, "%sWriter" % (format[0].upper() + format[1:]))
---> 18     item().write(data)

/home/sean/git/fbchat-archive-parser/fbchat_archive_parser/writers/writer.py in write(self, data)
     13     def write(self, data):
     14         if isinstance(data, FacebookChatHistory):
---> 15             return self.write_history(data)
     16         elif isinstance(data, ChatThread):
     17             return self.write_thread(data)

/home/sean/git/fbchat-archive-parser/fbchat_archive_parser/writers/csv.py in write_history(self, history, stream, writer)
     32             writer = self.get_writer(stream, True)
     33         for k in history.threads.keys():
---> 34             self.write_thread(history.threads[k], writer=writer)
     35 
     36     def write_thread(self, thread, stream=sys.stdout, writer=None):

/home/sean/git/fbchat-archive-parser/fbchat_archive_parser/writers/csv.py in write_thread(self, thread, stream, writer)
     38             writer = self.get_writer(stream, True)
     39         for message in thread.messages:
---> 40             self.write_message(message, thread, writer=writer)
     41 
     42     def write_message(self, message, parent=None, stream=sys.stdout,

/home/sean/git/fbchat-archive-parser/fbchat_archive_parser/writers/csv.py in write_message(self, message, parent, stream, writer)
     52             row[THREAD_ID_KEY] = "<unknown>" if not parent \
     53                                  else ", ".join(parent.participants)
---> 54         writer.writerow(row)

/usr/lib/python2.7/csv.pyc in writerow(self, rowdict)
    150 
    151     def writerow(self, rowdict):
--> 152         return self.writer.writerow(self._dict_to_list(rowdict))
    153 
    154     def writerows(self, rowdicts):

UnicodeEncodeError: 'ascii' codec can't encode character u'\xa5' in position 1: ordinal not in range(128)

This error does not occur in Python 3.
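For context, the failure comes down to the character u'\xa5' having no ASCII encoding, which is what Python 2's csv writer falls back to for unicode values. A small demonstration that runs on either Python version is below. The helper encode_for_py2_csv is hypothetical and only illustrates encoding to UTF-8 before writing; it is not what the tool does.

```python
def encode_for_py2_csv(value, encoding="utf-8"):
    """Return bytes that a Python 2 csv writer could accept for `value`."""
    return value.encode(encoding)


yen = u"\xa5"

# The implicit ASCII encoding is what raises the UnicodeEncodeError.
try:
    yen.encode("ascii")
    ascii_ok = True
except UnicodeEncodeError:
    ascii_ok = False

utf8_bytes = encode_for_py2_csv(yen)
print(ascii_ok, repr(utf8_bytes))
```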

UnicodeEncodeError: 'ascii' codec can't encode character u'\xfa' in position 5: ordinal not in range(128)

$ fbcap ./messages.htm 
Discovered chat thread with  [,  ]...Traceback (most recent call last):
  File "/usr/bin/fbcap", line 9, in <module>
    load_entry_point('fbchat-archive-parser==0.7.post12', 'console_scripts', 'fbcap')()
  File "/usr/lib/python2.7/site-packages/fbchat_archive_parser/main.py", line 164, in main
    app.run()
  File "/usr/lib/python2.7/site-packages/clip.py", line 652, in run
    self.invoke(self.parse(tokens))
  File "/usr/lib/python2.7/site-packages/clip.py", line 634, in invoke
    self._main.invoke(parsed)
  File "/usr/lib/python2.7/site-packages/clip.py", line 519, in invoke
    self._callback(**{k: v for k, v in iteritems(parsed) if k not in self._subcommands})
  File "/usr/lib/python2.7/site-packages/fbchat_archive_parser/main.py", line 109, in fbcap
    raise e
UnicodeEncodeError: 'ascii' codec can't encode character u'\xfa' in position 5: ordinal not in range(128)

CSV file output failing because of str vs. unicode

On Mac OS X Yosemite (10.10.5) with Python 2.7.10:

I get the following error when I try to output CSVs into separate files with fbcap -f csv -d <output dir> <input file>. This happens with both my personal archive file and the sample test file.

Traceback (most recent call last):
  File "/Users/andrew/dev/auto-chat/venv/bin/fbcap", line 11, in <module>
    sys.exit(main())
  File "/Users/andrew/dev/auto-chat/venv/lib/python2.7/site-packages/fbchat_archive_parser/main.py", line 188, in main
    app.run()
  File "/Users/andrew/dev/auto-chat/venv/lib/python2.7/site-packages/clip.py", line 652, in run
    self.invoke(self.parse(tokens))
  File "/Users/andrew/dev/auto-chat/venv/lib/python2.7/site-packages/clip.py", line 634, in invoke
    self._main.invoke(parsed)
  File "/Users/andrew/dev/auto-chat/venv/lib/python2.7/site-packages/clip.py", line 519, in invoke
    self._callback(**{k: v for k, v in iteritems(parsed) if k not in self._subcommands})
  File "/Users/andrew/dev/auto-chat/venv/lib/python2.7/site-packages/fbchat_archive_parser/main.py", line 131, in fbcap
    write(format, fbch, directory or sys.stdout)
  File "/Users/andrew/dev/auto-chat/venv/lib/python2.7/site-packages/fbchat_archive_parser/writers/__init__.py", line 40, in write
    write_to_dir(selected_writer(), stream_or_dir, data)
  File "/Users/andrew/dev/auto-chat/venv/lib/python2.7/site-packages/fbchat_archive_parser/writers/__init__.py", line 68, in write_to_dir
    writer.write_thread(thread, stream=thread_file)
  File "/Users/andrew/dev/auto-chat/venv/lib/python2.7/site-packages/fbchat_archive_parser/writers/csv.py", line 56, in write_thread
    writer = self.get_writer(stream, True)
  File "/Users/andrew/dev/auto-chat/venv/lib/python2.7/site-packages/fbchat_archive_parser/writers/csv.py", line 45, in get_writer
    w.writeheader()
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/csv.py", line 141, in writeheader
    self.writerow(header)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/csv.py", line 152, in writerow
    return self.writer.writerow(self._dict_to_list(rowdict))
TypeError: must be unicode, not str

I did some hunting but could not find the root of the problem. I think it is something to do with TextIOWrapper's UTF-8 encoding (writers/__init__.py:67-68) and the csv.py library in Python 2?
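If that guess is right, the mismatch would be that Python 2's csv module writes byte strings while a TextIOWrapper only accepts unicode. A possible workaround, sketched here with hypothetical helpers open_csv_stream and write_rows, is to pick the file mode per interpreter. On Python 2 any unicode values in the rows would still need to be encoded to UTF-8 first, which this sketch does not do.

```python
import csv
import io
import sys


def open_csv_stream(path):
    """Open `path` in the mode the csv module expects on this interpreter."""
    if sys.version_info[0] < 3:
        # Python 2: csv emits byte strings, so use a binary file.
        return open(path, "wb")
    # Python 3: csv emits text; newline="" lets csv control line endings.
    return io.open(path, "w", encoding="utf-8", newline="")


def write_rows(path, fieldnames, rows):
    """Write a header and the given dict rows to `path`."""
    with open_csv_stream(path) as stream:
        writer = csv.DictWriter(stream, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(rows)
```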

--resolve not working?

--resolve seems to get stuck after entering your email. The password prompt ("Facebook password:") does not display at all and I'm forced to end the process.

What I'm trying to do is "fbcap ./messages.htm -f json --resolve > messages.json" and I'm on version 0.9.post3.

Unexpected time format

Hi, and greetings from Finland! I was trying to parse my messages.htm with this tool, but it gave me an error about the time format, probably because I use Facebook in Finnish and my downloaded data is also in Finnish. The output I'm getting:

Unexpected time format in "31. heinäkuuta 2016 kello 21:05 UTC+03". If you downloaded your Facebook data in a language other than English, then it's possible support may need to be added to this tool.

Please report this as a bug on the associated GitHub page and it will be fixed promptly.

Language support

Is there ever going to be an update so it works with languages other than English? :)

Swedish time format

Swedish time format doesn't seem to be working. Would be great if this was possible to fix!

Czech locale

Hi,

I really appreciate your work on this. Could you please add support for Czech time formats? I get the following error:
Unexpected time format in "13. březen 2014 v 19:42 UTC+01". If you downloaded your Facebook data in a language other than English, then it's possible support may need to be added to this tool.

The timestamp in Czech looks like this:
<span class="meta">13. březen 2014 v 19:42 UTC+01</span>
which translates to: "13 March 2014 at 19:42 UTC+01"

I've tried adding support myself, coming up with the following format:
("cs_cz", "D. MMMM YYYY [v] HH:mm")

and running python ./setup.py develop as mentioned in this comment, but I still get the error (maybe because of non-ASCII characters, such as ř).

I would really appreciate if you could look into this, thank you.

Sidenote: I've noticed that, for some reason, Facebook used different forms of the month names for a while, such as 13. března instead of 13. březen. This might cause issues, since the arrow library supports only one form. It affected only a few messages in my file, but might cause trouble when adding official Czech locale support (I replaced those by hand in Notepad).
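The manual replacement described in the sidenote could be automated along these lines. This is only a sketch: normalize_czech_month and the mapping table are illustrative names, not part of the tool, and it assumes lowercase month names as in the timestamps quoted above. září is omitted because it is identical in both forms.

```python
import re

# Genitive form -> nominative form, as a one-form parser would expect it.
CZECH_GENITIVE_TO_NOMINATIVE = {
    "ledna": "leden", "února": "únor", "března": "březen",
    "dubna": "duben", "května": "květen", "června": "červen",
    "července": "červenec", "srpna": "srpen", "října": "říjen",
    "listopadu": "listopad", "prosince": "prosinec",
}

_GENITIVE_RE = re.compile(
    r"\b(" + "|".join(CZECH_GENITIVE_TO_NOMINATIVE) + r")\b")


def normalize_czech_month(text):
    """Rewrite a genitive Czech month name in `text` to the nominative."""
    return _GENITIVE_RE.sub(
        lambda m: CZECH_GENITIVE_TO_NOMINATIVE[m.group(0)], text)


print(normalize_czech_month("13. března 2014 v 19:42 UTC+01"))
```

Timestamps that already use the nominative form pass through unchanged, so the function can be applied to every timestamp before parsing.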

Polish locale

Hi, big thanks for your tool!
I have a problem with the Polish version of the archive:

Unexpected time format in "18 listopada 2016 o 22:20 UTC+01". If you downloaded your Facebook data in a language other than English, then it's possible support may need to be added to this tool.

I looked into it, and the problem is with the month names. In English, for example, the months are:
January, February, March [...] November, December
Facebook writes the month names in the archive in the Polish genitive:
Stycznia, Lutego, Marca [...] Listopada, Grudnia
while arrow.get only recognizes the nominative month names:
Styczeń, Luty, Marzec [...] Listopad, Grudzień

Here is EXAMPLE

Using babel.dates could probably resolve this issue:

Format date with month name in Polish in Python

Babel dates documentation

I tried to implement it, but I couldn't get past this error:
AttributeError: 'str' object has no attribute 'tzinfo'

Could you look into this issue? I will be grateful!
Sorry for not marking this with the bug label; it seems I either can't do it or don't know how.
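One way around the missing genitive names, without arrow or Babel, is a small lookup table plus a regular expression. The sketch below is only an illustration: parse_polish_stamp, STAMP_RE and the table are hypothetical names, and the pattern assumes the exact "D MONTH YYYY o HH:mm UTC+hh" shape quoted above. The AttributeError about tzinfo may mean a plain string was passed where a datetime was expected, so this version returns a timezone-aware datetime.

```python
import re
from datetime import datetime, timedelta, timezone

# Polish month names in the genitive, as they appear in the archive.
POLISH_GENITIVE_MONTHS = {
    "stycznia": 1, "lutego": 2, "marca": 3, "kwietnia": 4,
    "maja": 5, "czerwca": 6, "lipca": 7, "sierpnia": 8,
    "września": 9, "października": 10, "listopada": 11, "grudnia": 12,
}

STAMP_RE = re.compile(
    r"^(\d{1,2}) (\w+) (\d{4}) o (\d{1,2}):(\d{2}) UTC([+-]\d{1,2})$")


def parse_polish_stamp(text):
    """Parse e.g. '18 listopada 2016 o 22:20 UTC+01' into a datetime."""
    match = STAMP_RE.match(text.strip())
    if match is None:
        raise ValueError("unexpected time format: %r" % text)
    day, month_name, year, hour, minute, offset = match.groups()
    month = POLISH_GENITIVE_MONTHS[month_name.lower()]
    tz = timezone(timedelta(hours=int(offset)))
    return datetime(int(year), month, int(day),
                    int(hour), int(minute), tzinfo=tz)


print(parse_polish_stamp("18 listopada 2016 o 22:20 UTC+01").isoformat())
```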
