
Cinemagoer is a Python package useful to retrieve and manage the data of the IMDb (to which we are not affiliated in any way) movie database about movies, people, characters and companies

Home Page: https://cinemagoer.github.io/

License: GNU General Public License v2.0

Python 99.86% Makefile 0.14%

imdb movies actors cinema movie-database python database sql cast internet-movie-database

cinemagoer's Introduction

Cinemagoer (previously known as IMDbPY) is a Python package for retrieving and managing the data of the IMDb movie database about movies, people and companies.

This project and its authors are not affiliated in any way to Internet Movie Database Inc.; see the DISCLAIMER.txt file for details about data licenses.

Revamp notice

Starting in November 2017, many things were improved and simplified:

moved the package to Python 3 (compatible with Python 2.7)
removed dependencies: SQLObject, C compiler, BeautifulSoup
removed the "mobile" and "httpThin" parsers
introduced a test suite (please help with it!)

Main features

written in Python 3 (compatible with Python 2.7)
platform-independent
simple and complete API
released under the terms of the GPL 2 license

Cinemagoer powers many other software projects and has been used in various research papers.

Installation

Whenever possible, please use the latest version from the repository:

pip install git+https://github.com/cinemagoer/cinemagoer

But if you want, you can also install the latest release from PyPI:

pip install cinemagoer

Example

Here's an example that demonstrates how to use Cinemagoer:

from imdb import Cinemagoer

# create an instance of the Cinemagoer class
ia = Cinemagoer()

# get a movie
movie = ia.get_movie('0133093')

# print the names of the directors of the movie
print('Directors:')
for director in movie['directors']:
    print(director['name'])

# print the genres of the movie
print('Genres:')
for genre in movie['genres']:
    print(genre)

# search for a person name
people = ia.search_person('Mel Gibson')
for person in people:
    print(person.personID, person['name'])

Getting help

Please refer to the support page on the project homepage and to the online documentation on Read The Docs.

The sources are available on GitHub.

Contribute

Visit the CONTRIBUTOR_GUIDE.rst to learn how you can contribute to the Cinemagoer package.

License

Cinemagoer is released under the GPL license, version 2 or later. Read the included LICENSE.txt file for details.

NOTE: For a list of persons who share the copyright over specific portions of code, see the CONTRIBUTORS.txt file.

NOTE: See also the recommendations in the DISCLAIMER.txt file.


cinemagoer's Issues

Step by step install imdbpy

Hi all, can you help me install imdbpy step by step on Ubuntu 14?

imdbpy2sql.py never completes (OS X 10.7.5 Lion)

I've tried to run imdbpy2sql.py three times now and each time it gets to the stage "adding foreign keys" and stalls. I left it running for more than 24 hours then gave up. Running it again I can see it get to around 47/48 minutes CPU time in Activity Monitor (which is like a GUI version of the top command) then nothing happens. There's no disk activity and the memory usage is zero.

I set the computer never to sleep and ran the command through nohup in case that was a factor but the same thing happened.

As it's done most of the work except for the foreign keys, is there a way I can re-run it, skipping everything else and just doing the foreign keys?

Movie plot summaries page parser doesn't parse correctly

When tested with "The Matrix" no plot information is found. The markup must have changed.

Non-numeric season titles confuse the number of seasons

Some TV series have season titles which are not numeric. For example "Doctor Who 2005" (http://akas.imdb.com/title/tt0436992/combined) has a season titled "Unknown" which contains episodes that are not part of regular seasons. So the list of seasons for this series (as of May 2017) is [1,2,3,4,5,6,7,8,9,10,unknown]. IMDbPY only stores the number of seasons, in this case 11.

If one wants to get to the pages for the seasons, the link for the "unknown" season (http://akas.imdb.com/title/tt0436992/episodes?season=unknown) can not be generated. Worse, one might think that there would be a page URL such as http://akas.imdb.com/title/tt0436992/episodes?season=11

A probable solution would be to store the titles of the seasons as a list of strings.

Another related problem is regarding the number of seasons. In this example, I would suggest that the correct number of seasons should be 10 (since there is no season 11). This value could be reported as the largest numeric value in the season title list.
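
The two suggestions above (keep the season titles as strings, and report the largest numeric title as the number of seasons) can be sketched as follows. This is only an illustration: `summarize_seasons` is a hypothetical helper, not part of IMDbPY, and it assumes the season titles have already been scraped into a list.

```python
def summarize_seasons(season_titles):
    """Return (titles, season_count) from raw season titles.

    Titles are kept as strings so that non-numeric entries such as
    "unknown" survive; the count is the largest numeric title.
    """
    titles = [str(title).strip() for title in season_titles]
    numeric = [int(title) for title in titles if title.isdigit()]
    season_count = max(numeric) if numeric else 0
    return titles, season_count


# The season list from the "Doctor Who" example above.
titles, count = summarize_seasons([1, 2, 3, 4, 5, 6, 7, 8, 9, 10, "unknown"])
print(titles[-1], count)  # unknown 10
```

With the titles stored as strings, the URL for the "unknown" season can be built from the title itself instead of from a season counter.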

Movie color info parsed incorrectly

In the movie combined details page, the color info is parsed incorrectly for some titles. For the title with id '0063650' (the movie "If....") the color info is reported as ':(Eastmancolor) (uncredited)' when it should have been 'Color::(Eastmancolor) (uncredited)'

NULL movie_id in movie_link file

 * LOADING CSV FILE /home/bbaumer/dumps/imdb/raw/movie_link.csv...
ERROR: unable to import CSV file /home/bbaumer/dumps/imdb/raw/movie_link.csv: null value in column "movie_id" violates not-null constraint
DETAIL:  Failing row contains (14300, null, 95357, 12).
CONTEXT:  COPY movie_link, line 14300: "14300,NULL,95357,12"

Using PostgreSQL and a recent copy of the data files.

Method name2imdbID() doesn't work + solution

Method name2imdbID doesn't work; there should be nm instead of tt in _searchIMDb:
https://github.com/alberanid/imdbpy/blob/master/imdb/__init__.py#L873

Movie IMDb index in search results not parsed

Instead, the IMDb index becomes part of the title. For example, search for the title "blink". The result includes an item "Blink (IV)". This text is interpreted as the title of the movie, instead of the title being "Blink" with an added imdbIndex key whose value is "IV".
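
A minimal sketch of the expected behaviour, assuming the index is always a Roman numeral in parentheses at the end of the text. `split_imdb_index` and its regular expression are hypothetical and are not taken from the IMDbPY parsers.

```python
import re

# Trailing Roman-numeral index in parentheses, e.g. "Blink (IV)".
_INDEX_RE = re.compile(r"^(?P<title>.*\S)\s+\((?P<index>[IVXLCDM]+)\)$")


def split_imdb_index(text):
    """Split "Title (IV)" into a dict with 'title' and 'imdbIndex' keys."""
    text = text.strip()
    match = _INDEX_RE.match(text)
    if match is None:
        # No index found: the whole text is the title.
        return {"title": text}
    return {"title": match.group("title"), "imdbIndex": match.group("index")}


print(split_imdb_index("Blink (IV)"))  # {'title': 'Blink', 'imdbIndex': 'IV'}
```

A year in parentheses, such as "(2013)", does not match the Roman-numeral pattern and is left in the title by this sketch.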

UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 0

Getting the following error when running the sample provided in the README.

UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 0: ordinal not in range(128)

--local-infile with MySQL?

I'm trying to use imdbpy2sql. When running

imdbpy2sql.py --mysql-force-myisam -d ~/dumps/imdb/raw -u 'mysql://root:<password>@localhost/imdb' -c ~/dumps/imdb/raw

I get the following error:

loading CSV files into the database
 * LOADING CSV FILE /home/bbaumer/dumps/imdb/raw/complete_cast.csv...
ERROR: unable to import CSV file /home/bbaumer/dumps/imdb/raw/complete_cast.csv: (1148, 'The used command is not allowed with this MySQL version')

The problem is that the --local-infile flag on my client is not on. Now, if I was writing the command myself, I could just add --local-infile=1 and it should work. But since imdbpy2sql is generating the mysql command for me, I can't add that option.

Could you add that option? Or an option to pass-through additional arguments to mysql?

Or is there a better solution? Any help would be appreciated.

[BTW, I am working on a derivative R package. See (https://github.com/beanumber/imdb/issues/3)]

Could SQLAlchemy be an optional dependency again?

I never use the SQL system and I wouldn't want to have to install SQLAlchemy to run IMDbPY. Installing it by default is OK but it would be nice to again have the option not to install it.

How to obtain the plot of a movie?

Hi,

My question may be stupid: I am trying to obtain the plots of several movies, but I didn't find the right API to use.

I tried the following code:
the_matrix = ia.get_movie('0133093')
print the_matrix.get('plot outline') # works for me
print the_matrix['plot'] # doesn't work for me

I am wondering what is the right way to get the plot outlines, summaries, and synopses?

Constraint errors

Hey !

With the new version, I'm having two errors when building the foreign keys :

ERROR caught exception creating a foreign key: Cannot add or update a child row: a foreign key constraint fails (`imdb`.<result 2 when explaining filename '#sql-dbf_47'>, CONSTRAINT `aka_title_movie_id_exists` FOREIGN KEY (`movie_id`) REFERENCES `title` (`id`))
ERROR caught exception creating a foreign key: Cannot add or update a child row: a foreign key constraint fails (`imdb`.<result 2 when explaining filename '#sql-dbf_4a'>, CONSTRAINT `movie_link_linked_movie_id_exists` FOREIGN KEY (`linked_movie_id`) REFERENCES `title` (`id`))

It seems that you try to insert AKAs and movie links for a title that doesn't exist.

Unicode error in actor names

Movie ID : 0060196

Line 90

for name in cast:
       print '      %s (%s)' % (name['name'], name.currentRole)

UnicodeEncodeError: 'ascii' codec can't encode character u'\xe8' in position 11: ordinal not in range(128)

Notify users if the wrong version of Python is used

Since we now only support Python 3, we have to notify users if the wrong version of Python is used.
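
One possible shape for such a check, as a sketch only: the minimum version `(3, 0)` and the helper name `check_python_version` are assumptions for illustration, not the project's actual code.

```python
import sys

# Assumed threshold; adjust to the versions that are actually supported.
MINIMUM_PYTHON = (3, 0)


def check_python_version(version_info=None, minimum=MINIMUM_PYTHON):
    """Raise RuntimeError with a readable message if the interpreter is too old."""
    if version_info is None:
        version_info = sys.version_info
    current = tuple(version_info[:2])
    if current < minimum:
        raise RuntimeError(
            "Python %d.%d or later is required, but %d.%d is running"
            % (minimum + current)
        )
    return current


# Check the interpreter that is running this script.
print(check_python_version())
```

Calling the check at import time (or at the top of setup.py) would make the failure visible before any Python 3 only syntax is reached.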

Allow local database update

A very interesting feature of this project is the possibility to download a local copy of the whole imdb database.

But it would be even more interesting to update this database without downloading it again, maybe by looking at the RSS feeds.

Duplicate persons in imdbpy2sql database, probably normal/canonical parser problem

I just noticed that the database contains two entries for the same person:

   id    |        name         | imdb_index | gender | name_pcode_cf | name_pcode_nf | surname_pcode |              md5sum
---------+---------------------+------------+--------+---------------+---------------+---------------+----------------------------------
 1520700 | Toro, Guillermo del |            | m      | T6246         | G4653         | T6            | 6fa07d318a205b82ef40a5c76db0974e
 2709933 | del Toro, Guillermo |            |        | D4362         | G4653         | D436          | 99b042e07b27b0321ca6e19f126c67f0

Could this be related to some bug in name parsing in the imdbpy2sql script?
One name appears in the normal form, the other in the canonical form.

The first name has 100 movies.
The second name has 69 movies.

All 169 movies belong to the real Guillermo del Toro, which means that this must be one person instead of two.

In a few days I will update to the latest text files and see if there are any changes, but I think the bug may be caused by the middle "... del ..." part of the name.
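
To illustrate why the two rows look like the same person, here is a small sketch that turns a "Surname, Given names" string back into "Given names Surname". `normalize_name` is a hypothetical helper written for this example; it is not the function imdbpy2sql uses.

```python
def normalize_name(name):
    """Turn "Surname, Given names" into "Given names Surname"."""
    if ", " not in name:
        return name.strip()
    surname, given = name.split(", ", 1)
    return "%s %s" % (given.strip(), surname.strip())


# The two database rows from the report above.
first = normalize_name("Toro, Guillermo del")
second = normalize_name("del Toro, Guillermo")
print(first, "|", second)  # Guillermo del Toro | Guillermo del Toro
```

Both rows collapse to the same string, which is consistent with the guess that the particle "del" was attached to the surname in one source file and to the given name in another.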

What does "nr_order" represent?

I've tried searching, and I'm not getting anything useful from Google.

What does "nr_order" in the "cast_info" table mean?

AKA list not populating correctly..

Let's use IMDb ID tt4005402 as an example. The original title of the movie was Colonia; the movie was later renamed The Colony.

http://akas.imdb.com/title/tt4005402/?ref_=fn_al_tt_1

The full list of AKAs is at http://akas.imdb.com/title/tt4005402/releaseinfo?ref_=tt_dt_dt#akas

However, if we do a search:

from imdb import IMDb
from pprint import pprint

ia = IMDb(accessSystem='http')
imdb_result = ia.search_movie('Colonia', results=5)

for res in imdb_result:
     pprint(res.data)

We only get the following results...

{'akas': [u'Colonia Dignidad'],
 'kind': u'movie',
 'title': u'The Colony',
 'year': 2015}
{'kind': u'tv series',
 'title': u'Colonia (2013) (TV Episode)  - Season 12 | Episode 3  - Espa\xf1oles en el mundo',
 'year': 2009}
{'kind': u'tv series',
 'title': u'Colonia (2011) (TV Episode)  - Season 2 | Episode 7  - Danni Lowinski',
 'year': 2010}
{'akas': [u'La colonia'],
 'kind': u'movie',
 'title': u'The Colony (I)',
 'year': 2013}
{'akas': [u'Colonia, brigada criminal'],
 'kind': u'tv series',
 'title': u'SOKO K\xf6ln',
 'year': 2003}

The main page, on the other hand, lists the proper original title as an AKA, and it is also listed in the AKA section, whereas the akas value in the search result is simply a one-element list.

If you then fetch the movie by its IMDb ID with ia.get_movie(4005402), the title changes from The Colony to Colonia.

The AKA list then becomes:

[u'Colonia::France (imdb display title), International (imdb display title)',
 u'Colonia Dignidad - Es gibt kein Zur\xfcck::, Germany (imdb display title)',
 u'The Colony::UK (imdb display title)',
 u'\u0397 \u03b1\u03c0\u03bf\u03b9\u03ba\u03af\u03b1::Greece',
 u'\u041a\u043e\u043b\u043e\u043d\u0438\u044f \u0414\u0438\u0433\u043d\u0438\u0434\u0430\u0434::Russia',
 u'A kol\xf3nia::Hungary (imdb display title)',
 u'Amor e Revolu\xe7\xe3o::Brazil (imdb display title)',
 u'Colonia Dignidad::Chile (imdb display title)',
 u'Kolonija::Slovenia (imdb display title)']
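
The entries in the list above follow a "title::notes" layout. A short sketch of how such strings could be taken apart, assuming "::" always separates the title from a comma-separated list of notes; `parse_aka` is a hypothetical helper, not an IMDbPY function.

```python
def parse_aka(entry):
    """Split "title::note, note" into (title, [notes])."""
    title, _separator, note = entry.partition("::")
    # Drop empty pieces such as the one before the leading comma in "::, Germany".
    notes = [part.strip() for part in note.split(",") if part.strip()]
    return title, notes


print(parse_aka("Colonia::France (imdb display title), International (imdb display title)"))
```

Entries without a "::" separator come back with an empty list of notes.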

Receiving imdb link in another format

After parsing Twitter through its API, I am getting the IMDb link in this format: https://t.co/pEoXfFP7Xc. So the IMDb ID would be pEoXfFP7Xc. But after calling ia.get_movie('pEoXfFP7Xc'), it said the format is invalid.
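
A t.co address is a shortened link, so the text after the slash is not an IMDb ID. Assuming the short link has first been resolved to the full IMDb URL (for instance by following the HTTP redirect, which is not shown here), the numeric ID can be pulled out of the title URL. `imdb_id_from_url` is a hypothetical helper for this sketch.

```python
import re

# IMDb title URLs look like http://www.imdb.com/title/tt0133093/
_TITLE_ID_RE = re.compile(r"imdb\.com/title/tt(\d+)")


def imdb_id_from_url(url):
    """Return the numeric movie ID from an IMDb title URL, or None."""
    match = _TITLE_ID_RE.search(url)
    return match.group(1) if match else None


print(imdb_id_from_url("http://www.imdb.com/title/tt0133093/"))  # 0133093
print(imdb_id_from_url("https://t.co/pEoXfFP7Xc"))  # None
```

The returned string has the same shape as the '0133093' passed to get_movie in the README example.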

imdbpy2sql shows an error while finishing with foreign keys

adding foreign keys (this may take a while)
ERROR caught exception creating a foreign key: insert or update on table "aka_title" violates foreign key constraint "movie_id_exists"
DETAIL:  Key (movie_id)=(0) is not present in table "title".

 # TIME createForeignKeys() : 1min, 28sec (wall) 0min, 0sec (user) 0min, 0sec (system)

Is it OK to receive such an error about foreign keys when the imdbpy2sql script finishes?
Did just one foreign key fail, or all of them?

Direct hit parsers can be removed from movie search pages

If I understand correctly what this feature does, it doesn't apply anymore. A search that results in one movie doesn't display the movie page. For example, searching for "Od+instituta+do+proizvodnje" displays a result page with only one movie in it. If that's really the case removing these parsers from the search pages would simplify the code.

invalid certificate of akas.imdb.com

Some requests get redirected to https; unfortunately the certificate for akas.imdb.com is not valid, so we must ignore it.

codename: simplify

IMDbPY contains a lot of legacy code and needs some new features.
Let's fix it in the master branch. :-)

If you need the old version (supporting Python 2.7), look at the imdbpy-legacy branch.

remove the "mobile" parser
remove SQLObject support
remove the cutils C module (keep it, but make it optional and off by default)
move to Python 3: #27
- http parser
- sql parser
introduce support for the new data set: #60
introduce python-requests for queries (to support sessions): #87
introduce documentation about those changes

Optionally:

remove the BeautifulSoup dependency (python-lxml will be required)
if possible, re-introduce Python 2.7 compatibility

getIMDB dict not being fully populated anymore

As an example, the following search does not populate the attributes genre, year, cast, etc.

This is for the imdb id 3315342

{'_Container__role': None,
 '_roleClass': <class 'imdb.Character.Character'>,
 '_roleIsPerson': False,
 'accessSystem': 'http',
 'charactersRefs': {},
 'current_info': ['main', 'plot'],
 'data': {'kind': u'movie',
          'plot': [u'In 2029 the mutant population has shrunken significantly and the X-Men have disbanded. Logan, whose power to self-heal is dwindling, has surrendered himself to alcohol and now earns a living as a chauffeur. He takes care of the ailing old Professor X whom he keeps hidden away. One day, a female stranger asks Logan to drive a girl named Laura to the Canadian border. At first he refuses, but the Professor has been waiting for a long time for her to appear. Laura possesses an extraordinary fighting prowess and is in many ways like Wolverine. She is pursued by sinister figures working for a powerful corporation; this is because her DNA contains the secret that connects her to Logan. A relentless pursuit begins - In this third cinematic outing featuring the Marvel comic book character Wolverine we see the superheroes beset by everyday problems. They are aging, ailing and struggling to survive financially. A decrepit Logan is forced to ask himself if he can or even wants to put his remaining powers to good use. It would appear that in the near-future, the times in which they were able put the world to rights with razor sharp claws and telepathic powers are now over.',
                   u"In the near future, a weary Logan cares for an ailing Professor X somewhere on the Mexican border. However, Logan's attempts to hide from the world and his legacy are upended when a young mutant arrives, pursued by dark forces."],
          'title': u'Help'},
 'infoset2keys': {'main': ['kind', 'title'], 'plot': ['plot']},
 'key2infoset': {'kind': 'main', 'plot': 'plot', 'title': 'main'},
 'keys_tomodify': {'alternate versions': None,
                   'business': None,
                   'crazy credits': None,
                   'dvd': None,
                   'faqs': None,
                   'goofs': None,
                   'laserdisc': None,
                   'news': None,
                   'plot': None,
                   'quotes': None,
                   'soundtrack': None,
                   'supplements': None,
                   'trivia': None,
                   'video review': None},
 'modFunct': <function modClearRefs at 0x7f5bb681acf8>,
 'movieID': '3315342',
 'myID': None,
 'myTitle': u'',
 'namesRefs': {},
 'notes': u'',
 'titlesRefs': {}}

0 results for keyword search

I'm getting 0 results for keyword searches with the package. Seems it's not functional?

please accept tspecht pull request

i need the Python 3 version

pep8

Any Python repos should be formatted to adhere to pep8.
https://www.python.org/dev/peps/pep-0008/

Some of the formatting standards are commonly ignored, however, such as

E501: line too long - lines should be no more than 79 characters in length

Import of data into MySQL db via imdb2sql hanging

My import of the data into the MySQL database using imdb2sql.py is getting stuck at the following:

building database indexes (this may take a while)
# TIME createIndexes() : 31min, 51sec (wall) 0min, 0sec (user) 0min, 0sec (system)
adding foreign keys (this may take a while)

Have tried it twice, but it keeps hanging at this point.
Any suggestions how to tackle this and whether the databases can be used by aborting at this point?

-Saish

Person akas not collected from search page

The HTML markup for person akas has changed from <em> to <i> but I haven't changed it in the parser because there seems to be inconsistency between how movie akas and person akas are handled. For movies, the parser returns:

(imdb_id, {'title': ..., 'akas': [...]})

whereas for persons the parser returns:

(imdb_id, {'name': ...}, [list_of_akas?])

The akas are a part of the dict in the movie result and they are the third element of the tuple in the person result.
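
One way to reconcile the two shapes is to fold the optional third element into the dict, so that both kinds of result look like the movie one. This is a sketch only: `normalize_result` is a hypothetical helper, and the sample values below are made-up placeholders rather than real parser output.

```python
def normalize_result(result):
    """Return (imdb_id, data) with any akas stored under data['akas']."""
    imdb_id = result[0]
    data = dict(result[1])  # copy, so the caller's dict is not modified
    if len(result) > 2 and result[2]:
        data["akas"] = list(result[2])
    return imdb_id, data


# Placeholder tuples in the two shapes described above.
movie_result = ("0000001", {"title": "Some Title", "akas": ["Other Title"]})
person_result = ("0000002", {"name": "Some Name"}, ["Other Name"])

print(normalize_result(movie_result))
print(normalize_result(person_result))
```

After this step both results are two-element tuples whose akas, if any, live under the 'akas' key.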

Install without sql feature fails

repo freshly cloned,
OSX 10.11.12

flap at MacBook-Pro on master* ± python ./setup.py --without-sql  install                                                                                          ~/Dev/imdbpy 1 ↵ 
Created locale for: ar bg de en es fr it tr.
Traceback (most recent call last):
  File "./setup.py", line 238, in <module>
    setuptools.setup(**params)
  File "/usr/local/Cellar/python/2.7.8_1/Frameworks/Python.framework/Versions/2.7/lib/python2.7/distutils/core.py", line 137, in setup
    ok = dist.parse_command_line()
  File "build/bdist.macosx-10.9-x86_64/egg/setuptools/dist.py", line 275, in parse_command_line
  File "build/bdist.macosx-10.9-x86_64/egg/setuptools/dist.py", line 371, in _finalize_features
  File "build/bdist.macosx-10.9-x86_64/egg/setuptools/dist.py", line 785, in include_in
  File "build/bdist.macosx-10.9-x86_64/egg/setuptools/dist.py", line 414, in include_feature
distutils.errors.DistutilsOptionError: access to SQL databases is required, but was excluded or is not available

Importing data has broken in the latest commit

Commit: 170251d

python imdbpy2sql.py --mysql-force-myisam -d ~/Downloads/imdb-data/ -u mysql://user@localhost:/imdb-data

Traceback (most recent call last):
  File "imdbpy2sql.py", line 3074, in <module>
    run()
  File "imdbpy2sql.py", line 2939, in run
    readMovieList()
  File "imdbpy2sql.py", line 1533, in readMovieList
    mid = CACHE_MID.addUnique(title, yearData)
  File "imdbpy2sql.py", line 1137, in addUnique
    else: return self.add(key, miscData)
  File "imdbpy2sql.py", line 1012, in add
    self[key] = c
  File "imdbpy2sql.py", line 921, in __setitem__
    self.flush()
  File "imdbpy2sql.py", line 975, in flush
    self.flush(quiet=quiet, _recursionLevel=_recursionLevel)
  File "imdbpy2sql.py", line 975, in flush
    self.flush(quiet=quiet, _recursionLevel=_recursionLevel)
  File "imdbpy2sql.py", line 975, in flush
    self.flush(quiet=quiet, _recursionLevel=_recursionLevel)
  File "imdbpy2sql.py", line 976, in flush
    self._tmpDict = secondHalf

update episodes - missing pilot

If you run the following commands:
import imdb
i = imdb.IMDb()
res = i.search_movie('blackadder')
r1 = res[0]
i.update(r1,'episodes')
print str(r1['episodes'])

You will get the following results:
{1: {1: <Movie id:0526541[http] title:_"The Black Adder" The Foretelling (1983)_>, 2: <Movie id:0526537[http] title:_"The Black Adder" Born to Be King (1983)_>, 3: <Movie id:0526539[http] title:_"The Black Adder" The Archbishop (1983)_>, 4: <Movie id:0526542[http] title:_"The Black Adder" The Queen of Spain's Beard (1983)_>, 5: <Movie id:0526543[http] title:_"The Black Adder" Witchsmeller Pursuivant (1983)_>, 6: <Movie id:0526540[http] title:_"The Black Adder" The Black Seal (1983)_>}}

which is missing the pilot episode ref: (http://www.imdb.com/title/tt0084988/episodes?season=1&ref_=tt_eps_sn_1)

Tested on latest version (cc63c25)

Mini series kind and years parsed incorrectly

The kind for mini series is reported as "tv series" (code documentation suggests that it would be "tv mini series"). Also, series years doesn't get parsed. Data to test: Band of Brothers (id: 0185906). The "series years" key should be "2001-2001" but it's not in the result.

Importing to Sql Server - text has encoding issues

Hi,

Thank you very much for a great tool.

I am using it to import the latest IMDb CSV files into a local SQL Server Express 2014 database.
When I look at the data in the DB, I see text like this in title.title:

"A PrÃ³xima VÃtima"
"DiscriminaciÃ³n en el lenguaje"
...

In name.name I see:

"Aarseth, Ã˜ystein"
"Abati, JoÃ«l"
...

It looks like an encoding problem. The command I use to import is:
python.exe imdbpy2sql.py -d C:\imdb-files -u "mssql://connection text" --ms-sqlserver

I am using SQL Server 2014 Express.
What can I do to fix it?

Thank you,

Eric
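
The garbled samples above have the pattern typically produced when UTF-8 bytes are read back as Windows-1252. That is only an assumption about what happened in this import, but it can be reproduced and reversed in a few lines. `repair_mojibake` is a hypothetical helper for the illustration.

```python
def repair_mojibake(text, wrong="cp1252", right="utf-8"):
    """Undo text that was encoded as `right` but decoded as `wrong`."""
    try:
        return text.encode(wrong).decode(right)
    except (UnicodeEncodeError, UnicodeDecodeError):
        # The text does not fit the assumed pattern; leave it untouched.
        return text


# Reproduce the damage: UTF-8 bytes decoded with the wrong codec.
garbled = "Øystein".encode("utf-8").decode("cp1252")
print(garbled)                   # Ã˜ystein
print(repair_mojibake(garbled))  # Øystein
```

If this round trip restores the names, the fix belongs in the connection or column encoding used during the import rather than in a repair step afterwards.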

support for the new S3 dataset

IMDb has introduced a new alternative interface: http://www.imdb.com/interfaces

Now the files are stored in S3 and are easier to process (there's also a lot less information than what was previously available).

We need a new access system to support the import of this data in a SQL database and retrieve them.

Don't import {{SUSPENDED}}

Is it possible to skip suspended titles or flag them in the database?

Thanks

Index of episode and ids of previous and next episodes can be collected

The combined page of a TV series episode contains the episode index (order of episodes within all episodes in all seasons, not just its own season) and links to the previous and next episodes. These can be added to the collected data.
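
The issue proposes reading these values from the episode page. As an alternative illustration, the same three values can be derived from a nested mapping of season number to episode number, like the 'episodes' data shown elsewhere on this page. This is a sketch under that assumption: `episode_neighbours` is a hypothetical helper, and it expects all season and episode keys to be numbers.

```python
def episode_neighbours(episodes, season, episode):
    """Return (overall_index, previous_key, next_key) for one episode.

    `episodes` maps season number -> {episode number -> anything}.
    Keys are (season, episode) tuples; the overall index is 1-based.
    """
    ordered = [(s, e) for s in sorted(episodes) for e in sorted(episodes[s])]
    position = ordered.index((season, episode))
    previous_key = ordered[position - 1] if position > 0 else None
    next_key = ordered[position + 1] if position + 1 < len(ordered) else None
    return position + 1, previous_key, next_key


# Placeholder data: two episodes in season 1, one in season 2.
sample = {1: {1: "a", 2: "b"}, 2: {1: "c"}}
print(episode_neighbours(sample, 2, 1))  # (3, (1, 2), None)
```

Values computed this way depend on the episode list being complete, which is why collecting them directly from the page, as suggested above, may be more reliable.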

search_movie "invalid syntax"

Hi,

Apologies for what is likely a case of user error.

I used "pip install imdbpy" to install the software. It completed without error.
I can run the following commands in ipython without error:

import imdb
ia = imdb.IMDb()

But when I try ia.search_movie('Jaws'), the result is always an empty list ([]). This is true regardless of what movie I am searching for.

If I just import search_movie and try
search_movie 'Jaws'

I get the error, "Invalid syntax."

Can someone enlighten me?

Get colour info of movie

In [1]: from imdb import IMDb
In [2]: imdb = IMDb()
In [3]: m = imdb.get_movie('64276')
In [4]: m.data['color info']
Out[4]: [u'Black and White']

On imdb I see full dates:
http://www.imdb.com/title/tt0064276/
http://cl.ly/image/2N0c322T1i2u
The same for http://www.imdb.com/title/tt2076220/

In [6]: import imdb
In [7]: imdb.__version__
Out[7]: '5.1dev20150822'

Only years in dates of persons

In [1]: from imdb import IMDb
In [2]: imdb = IMDb()
In [3]: p = imdb.get_person('0000454')
In [4]: p.data['birth date']
Out[4]: u'1936'
In [5]: p.data['death date']
Out[5]: u'2010'

On imdb I see full dates:
http://www.imdb.com/name/nm0000454/
http://cl.ly/image/2F2G183C0K2q

In [6]: import imdb
In [7]: imdb.__version__
Out[7]: '5.1dev20141116'

business and literature parsers are broken

For 'http', the business parser is broken. The 'literature' parser suffers from the same problem.

Movie taglines parser works only with lxml

The xpath used for the movie taglines parser cannot be handled on the beautifulsoup side. If lxml is not installed, the parser returns no tagline.

Python3 compatibility

As Guido van Rossum said at this PyCon, everybody should start switching to Python 3.

Videogames are indicated as "movie" in person filmography

The kind attribute of an element in the filmography of a person has the value "movie" for elements that are not movies at all.
Here is a simple test with actor Elijah Wood and the movie and videogame with the same title The Lord of the Rings: The Return of the King.

import unittest
from imdb import IMDb


class TestMovieKind(unittest.TestCase):

    def runTest(self):
        ia = IMDb(loggingLevel='error')
        person = ia.get_person('0000704')  # Elijah Wood
        for movie in person.get('actor', []):
            # the videogame must not be reported as a movie
            if movie.getID() == '0387360':
                print(movie['title'], '=> videogame for Xbox')
                self.assertNotEqual(movie['kind'], 'movie')
            # the actual movie must be reported as a movie
            if movie.getID() == '0167260':
                print(movie['title'], '=> movie')
                self.assertEqual(movie['kind'], 'movie')


if __name__ == "__main__":
    unittest.main()

Timeout error IOError('socket error', timeout('timed out',))

When I try

ia = imdb.IMDb()
s_result = ia.search_movie(title)
first = s_result[0]
# get synopsis                                                                                                                             
ia.update(first, 'synopsis')

I get this error in many cases

CRITICAL [imdbpy] /usr/lib64/python2.7/site-packages/imdb/_exceptions.py:35: IMDbDataAccessError exception raised; args: ({'exception type': 'IOError', 'url': 'http://akas.imdb.com/title/tt0110647/combined', 'errcode': 'socket error', 'proxy': '', 'original exception': IOError('socket error', timeout('timed out',)), 'errmsg': 'timed out'},); kwds: {}

Any reason for this?
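
Timeouts like this are often transient, so one common workaround is to retry the call a few times with a growing delay. The sketch below is generic Python and not an IMDbPY feature; `call_with_retries` and its parameters are hypothetical names chosen for the example.

```python
import time


def call_with_retries(func, attempts=3, delay=1.0, exceptions=(IOError,), sleep=time.sleep):
    """Call func(); on a listed exception, wait and retry up to `attempts` times."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for attempt in range(attempts):
        try:
            return func()
        except exceptions as error:
            last_error = error
            if attempt + 1 < attempts:
                # Exponential backoff: delay, 2 * delay, 4 * delay, ...
                sleep(delay * (2 ** attempt))
    # Every attempt failed: re-raise the last error seen.
    raise last_error
```

With the code from this report it could be used as `call_with_retries(lambda: ia.update(first, 'synopsis'), exceptions=(imdb.IMDbDataAccessError,))`, since IMDbDataAccessError is the exception named in the log line above.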

Python 3.X support

I am writing a Python 3 project, and I want to use this awesome plugin. Do you plan to support Python 3 in the future?

Kind entry for video games has inconsistent case with others

The kind for video games is given as "Video Game" whereas for other kinds the value is in lower case, as in "tv series" or "video movie".
Possible fix: Modify the _TITLE_KINDS dict in imdb/utils.py.
I'm not changing it since it might break compatibility with existing code.
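
Until the dict is changed, calling code can sidestep the inconsistency by normalizing the value itself. A minimal sketch; `normalize_kind` is a hypothetical helper and does not touch _TITLE_KINDS.

```python
def normalize_kind(kind):
    """Lower-case a kind value and collapse runs of whitespace."""
    return " ".join(kind.lower().split())


print(normalize_kind("Video Game"))  # video game
print(normalize_kind("tv series"))   # tv series
```

Normalizing on the caller's side keeps existing comparisons against "Video Game" working in the library while new code compares against the lower-case form.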

movie.asXML Key Error: 'long imdb name'

I used imdbpy's .asXML method to generate a collection of XML documents that represent all the movies within the IMDb dataset (as of 3/2015). While this works for the vast majority of titles, I receive the following stack trace when attempting to get the XML representation of a title named "Los rosarios" (XML output from OMDB below). Presumably, this indicates that the long name for the director in question does not exist.

STARTING RUN WITH 8 PROCESSES
CREATING NODES FOR 3224547 MOVIES
ERROR: COULD NOT PARSE MOVIE WITH ID: 2680117
TRACEBACK:
Traceback (most recent call last):
  File "generate_xml_nodes.py", line 53, in create_nodes
    tree = ElementTree.XML(result.asXML(), parser)
  File "/usr/local/lib/python2.7/dist-packages/IMDbPY-5.1dev_20150313-py2.7-linux-x86_64.egg/imdb/utils.py", line 1452, in asXML
    value = self.getAsXML(key, _with_add_keys=_with_add_keys)
  File "/usr/local/lib/python2.7/dist-packages/IMDbPY-5.1dev_20150313-py2.7-linux-x86_64.egg/imdb/utils.py", line 1441, in getAsXML
    fullpath=tag))
  File "/usr/local/lib/python2.7/dist-packages/IMDbPY-5.1dev_20150313-py2.7-linux-x86_64.egg/imdb/utils.py", line 1088, in _seq2xml
    fullpath='%s.%s' % (fullpath, tagName))
  File "/usr/local/lib/python2.7/dist-packages/IMDbPY-5.1dev_20150313-py2.7-linux-x86_64.egg/imdb/utils.py", line 1102, in _seq2xml
    item.__class__.__name__.lower()))
  File "/usr/local/lib/python2.7/dist-packages/IMDbPY-5.1dev_20150313-py2.7-linux-x86_64.egg/imdb/utils.py", line 1117, in _seq2xml
    _l.extend(_tag4TON(seq))
  File "/usr/local/lib/python2.7/dist-packages/IMDbPY-5.1dev_20150313-py2.7-linux-x86_64.egg/imdb/utils.py", line 963, in _tag4TON
    crValue = cr['long imdb name']
  File "/usr/local/lib/python2.7/dist-packages/IMDbPY-5.1dev_20150313-py2.7-linux-x86_64.egg/imdb/utils.py", line 1472, in __getitem__
    rawData = self.data[key]
KeyError: 'long imdb name'

OMDB XML data for this movie:

<root response="True"><movie title="Los rosariazos" year="2007" rated="N/A" released="01 Sep 2007" runtime="N/A" genre="Documentary" director="Carlos López" writer="N/A" actors="Daiana Barrios, Pablo Bonel, Rafael Cao, Fernando Carazo" plot="N/A" language="Spanish" country="Argentina" awards="N/A" poster="N/A" metascore="N/A" imdbRating="N/A" imdbVotes="N/A" imdbID="tt1247280" type="movie"/></root>

Relevant code:

      localdb = imdb.IMDb('sql', uri='mysql://root:password@localhost/imdb')
      for mid in range(start, stop):
          moviefile = 'movie-%d' % mid
          moviefilepath = os.path.join(SAVE_PATH, moviefile + '.xml')
          if not os.path.isfile(moviefilepath):
              result = localdb.get_movie(mid)
              parser = etree.XMLParser(remove_blank_text=True, recover=True, huge_tree=True, encoding='latin1')
              try:
                  tree = ElementTree.XML(result.asXML(), parser)
                  imdbfile = open(moviefilepath, 'w')
                  imdbfile.write(etree.tostring(tree, pretty_print=True))
                  imdbfile.close()
              except:
                  print "ERROR: COULD NOT PARSE MOVIE WITH ID: %d" % mid
                  print "TRACEBACK:"
                  print traceback.format_exc()

vote details page has changed

The vote detail page has changed.

For example, see http://akas.imdb.com/title/tt0780504/ratings

This code will now return nothing:

from imdb import IMDb

ia = IMDb()
m = ia.get_movie('0780504', 'vote details')
print(sorted(m.keys()))
print(m.get('mean and median'))