pystruct / pystruct

Simple structured learning framework for python

License: BSD 2-Clause "Simplified" License

Python 98.30% Shell 1.05% Makefile 0.07% Cython 0.59%

pystruct's Introduction

PyStruct

PyStruct aims at being an easy-to-use structured learning and prediction library. Currently it implements only max-margin methods and a perceptron, but other algorithms might follow.

The goal of PyStruct is to provide a well-documented tool for researchers as well as non-experts to make use of structured prediction algorithms. The design tries to stay as close as possible to the interface and conventions of scikit-learn.

You can install pystruct using

pip install pystruct

Some of the functionality (namely OneSlackSSVM and NSlackSSVM) requires that cvxopt is installed. See the installation instructions for more details.

The full documentation and installation instructions can be found at the website: http://pystruct.github.io

You can contact the authors either via the mailing list or on GitHub.

Currently the project is mostly maintained by Andreas Mueller, but contributions are very welcome.

Jean-Luc Meunier (Naver Labs Europe) contributed a new model and did some maintenance, in the course of the EU READ project. See READ_Contribution.md


pystruct's Issues

Problem with sample data

Hi, I am using windows XP for the record,
I have encountered the following problem while trying to load the sample data

from pystruct.datasets import load_letters
letters = load_letters()
---------------------------------------------------------------------------
EOFError                                  Traceback (most recent call last)
<ipython-input-2-5bb547bc0e79> in <module>()
----> 1 letters = load_letters()

C:\Python27\lib\site-packages\pystruct\datasets\letters.pyc in load_letters()
     16     module_path = dirname(__file__)
     17     data_file = open(join(module_path, 'letters.pickle'))
---> 18     data = cPickle.load(data_file)
     19     # we add an easy to use image representation:
     20     data['images'] = [np.hstack([l.reshape(16, 8) for l in word])

EOFError:
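
One plausible cause, offered as an assumption rather than a confirmed diagnosis: line 17 in the traceback opens letters.pickle without a mode argument, so the file is read in text mode. On Windows, text mode can cut binary data short, which would surface as an EOFError inside the pickle loader. A minimal sketch of loading a pickle in binary mode (the helper name load_pickle and the demo file are illustrative, not part of pystruct):

```python
import os
import pickle
import tempfile


def load_pickle(path):
    """Load a pickle file, always opening it in binary mode."""
    # 'rb' avoids the newline translation and end-of-file handling
    # that text mode applies on Windows.
    with open(path, "rb") as handle:
        return pickle.load(handle)


# Round-trip a small payload that contains bytes which text mode
# would treat specially (0x1A and CR/LF).
payload = {"labels": [0, 1, 2], "raw": b"\x1a\r\n"}
tmp_dir = tempfile.mkdtemp()
tmp_path = os.path.join(tmp_dir, "demo.pickle")
with open(tmp_path, "wb") as handle:
    pickle.dump(payload, handle)

restored = load_pickle(tmp_path)
```

If this is indeed the cause, the equivalent change in the loader would be to pass 'rb' to the open call shown in the traceback.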

add node types

The GraphCRF should allow different nodes to take different types. This is more of a long-term goal as it is somewhat non-trivial, though.

Timestamps weird

The current timestamps_ attribute of the learners starts with one absolute timestamp.
That is pretty confusing. The first entry should be stored in a separate attribute so that timestamps_ contains only the actual relative times.
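
A minimal sketch of the proposed split, assuming the first entry is the absolute start time and every later entry is already relative to it (the function name split_timestamps is hypothetical):

```python
def split_timestamps(timestamps):
    """Separate the absolute start time from the relative entries.

    Assumes timestamps[0] is an absolute time and timestamps[1:] are
    already relative, as the issue describes.
    """
    if not timestamps:
        raise ValueError("timestamps must contain at least the start time")
    start_time = timestamps[0]
    relative = list(timestamps[1:])
    return start_time, relative


start, relative = split_timestamps([1370000000.0, 0.5, 1.25, 2.0])
```

With this layout a learner could keep the start time in its own attribute and expose only the relative list as timestamps_.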

Rename inference_method to inference in CRFs

As that just makes it less to type and the shorter name is unambiguous.

Test failure in latent SVMs

Got this from a fresh clone. Maybe assert_array_almost_equal should be used here?

======================================================================
FAIL: test_latent_svm.test_with_crosses_bad_init
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/usr/lib/python2.7/dist-packages/nose/case.py", line 197, in runTest
    self.test(*self.arg)
  File "/scratch/apps/src/pystruct/tests/test_learners/test_latent_svm.py", line 80, in test_with_crosses_bad_init
    assert_array_equal(np.array(Y_pred), Y)
  File "/usr/lib/pymodules/python2.7/numpy/testing/utils.py", line 719, in assert_array_equal
    verbose=verbose, header='Arrays are not equal')
  File "/usr/lib/pymodules/python2.7/numpy/testing/utils.py", line 645, in assert_array_compare
    raise AssertionError(msg)
AssertionError: 
Arrays are not equal

(mismatch 2.5%)
 x: array([[[0, 0, 0, 0, 0, 0, 0, 0],
        [0, 0, 0, 0, 0, 1, 0, 0],
        [0, 0, 0, 0, 1, 1, 1, 0],...
 y: array([[[0, 0, 0, 0, 0, 0, 0, 0],
        [0, 0, 0, 0, 0, 1, 0, 0],
        [0, 0, 0, 0, 1, 1, 1, 0],...

Add unit tests for standard datasets

We should test that they can be loaded and also processed.
The Cython code in loss-augmented prediction broke the snakes example, as it used unsigned char for y :-/

make CRF functions that don't belong to the model interface private

Such as get_edge, get_pairwise_potentials etc.

DaiCRF doesn't respect edge direction.

There is a failing test, caused by an error in the DAI wrappers. @vene noticed something was awry. It seems the setting of the pairwise potentials is not as straight-forward as I thought. I think one might need to use calcLinearState.
Any help appreciated ;)

Weighted loss

It would be nice to be able to pass node weights to the Hamming loss function, i.e. make some labels more important than others. For example, if we are labeling superpixels of an image instead of pixels, we want to minimize the number of wrong pixels, not superpixels.
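
The request can be sketched as follows. This is an illustration only: it assumes an unnormalized Hamming loss (a plain count of mismatches), and the function name and the weights argument are hypothetical, not existing pystruct API.

```python
def weighted_hamming_loss(y_true, y_pred, weights=None):
    """Sum the weights of all nodes whose predicted label is wrong.

    With weights=None every node counts 1.0, which gives the plain
    (unnormalized) Hamming loss.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if weights is None:
        weights = [1.0] * len(y_true)
    if len(weights) != len(y_true):
        raise ValueError("weights must have one entry per node")
    return sum(w for t, p, w in zip(y_true, y_pred, weights) if t != p)


# Three superpixels covering 120, 30 and 50 pixels; the second is wrong.
superpixel_sizes = [120, 30, 50]
loss = weighted_hamming_loss([0, 1, 1], [0, 0, 1], superpixel_sizes)
plain = weighted_hamming_loss([0, 1, 1], [0, 0, 1])
```

Using superpixel sizes as weights makes the loss count mislabeled pixels rather than mislabeled superpixels.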

Windows XP installation failed

Hi
by using pip install pystruct, I am receiving the following error

C:\Program Files\Microsoft Visual Studio 9.0\VC\INCLUDE\xlocale(342) : warning C4530: C++ exception handler used, but unwind semantics are not enabled. Specify /EHsc

ad3/FactorGraph.cpp(21) : fatal error C1083: Cannot open include file: 'sys/time.h': No such file or directory

Is that because sys/time.h is not available on win32?

Manage inference packages

The inference packages need to be installed far more painlessly.
Possibilities:

include the DAI wrappers, check if dai is installed
include ad3
include scripts to fetch other solvers
implement message passing?

y shape in pystruct.models.EdgeFeatureGraphCRF

document says:
Labels y are given as array of shape (n_features)

Should the shape be (n_nodes,) ?

Cannot install on MacOSX

Hello! I'm very interested in using pystruct, and attempting to install on MacOSX 10.8.4. My python info is:

Python 2.7.5 (v2.7.5:ab05e7dd2788, May 13 2013, 13:18:45)
[GCC 4.2.1 (Apple Inc. build 5666) (dot 3)] on darwin

I used easy_install to install pip, and the following happens when I attempt to install pystruct:

max$ sudo pip install pystruct
Password:
Downloading/unpacking pystruct
  Downloading pystruct-0.1.1.tar.gz (6.4MB): 6.4MB downloaded
  Running setup.py egg_info for package pystruct

    warning: no previously-included files matching '*.pyc' found under directory 'doc'
    warning: no previously-included files matching '*.pyo' found under directory 'doc'
    warning: no previously-included files matching '*.pyc' found under directory 'tests'
    warning: no previously-included files matching '*.pyo' found under directory 'tests'
    no previously-included directories found matching 'docs/_build'
    no previously-included directories found matching 'docs/auto_examples'
    no previously-included directories found matching 'docs/generated'
Downloading/unpacking ad3 (from pystruct)
  Downloading ad3-2.0.tar.gz (518kB): 518kB downloaded
  Running setup.py egg_info for package ad3

Downloading/unpacking pyqpbo (from pystruct)
  Downloading pyqpbo-0.1.tar.gz (76kB): 76kB downloaded
  Running setup.py egg_info for package pyqpbo

Installing collected packages: pystruct, ad3, pyqpbo
  Running setup.py install for pystruct

    warning: no previously-included files matching '*.pyc' found under directory 'doc'
    warning: no previously-included files matching '*.pyo' found under directory 'doc'
    warning: no previously-included files matching '*.pyc' found under directory 'tests'
    warning: no previously-included files matching '*.pyo' found under directory 'tests'
    no previously-included directories found matching 'docs/_build'
    no previously-included directories found matching 'docs/auto_examples'
    no previously-included directories found matching 'docs/generated'
    building 'pystruct.models.utils' extension
    clang -fno-strict-aliasing -fno-common -dynamic -g -Os -pipe -fno-common -fno-strict-aliasing -fwrapv -mno-fused-madd -DENABLE_DTRACE -DMACOSX -DNDEBUG -Wall -Wstrict-prototypes -Wshorten-64-to-32 -DNDEBUG -g -Os -Wall -Wstrict-prototypes -DENABLE_DTRACE -arch i386 -arch x86_64 -pipe -I/System/Library/Frameworks/Python.framework/Versions/2.7/include/python2.7 -c src/utils.c -o build/temp.macosx-10.8-intel-2.7/src/utils.o
    clang: error: no such file or directory: 'src/utils.c'
    clang: error: no input files
    error: command 'clang' failed with exit status 1
    Complete output from command /usr/bin/python -c "import setuptools;__file__='/private/tmp/pip_build_root/pystruct/setup.py';exec(compile(open(__file__).read().replace('\r\n', '\n'), __file__, 'exec'))" install --record /tmp/pip-8TrbT9-record/install-record.txt --single-version-externally-managed:
    running install

running build

running build_py

creating build

creating build/lib.macosx-10.8-intel-2.7

creating build/lib.macosx-10.8-intel-2.7/pystruct

copying pystruct/__init__.py -> build/lib.macosx-10.8-intel-2.7/pystruct

copying pystruct/plot_learning.py -> build/lib.macosx-10.8-intel-2.7/pystruct

creating build/lib.macosx-10.8-intel-2.7/pystruct/learners

copying pystruct/learners/__init__.py -> build/lib.macosx-10.8-intel-2.7/pystruct/learners

copying pystruct/learners/downhill_simplex_ssvm.py -> build/lib.macosx-10.8-intel-2.7/pystruct/learners

copying pystruct/learners/latent_structured_svm.py -> build/lib.macosx-10.8-intel-2.7/pystruct/learners

copying pystruct/learners/n_slack_ssvm.py -> build/lib.macosx-10.8-intel-2.7/pystruct/learners

copying pystruct/learners/one_slack_ssvm.py -> build/lib.macosx-10.8-intel-2.7/pystruct/learners

copying pystruct/learners/ssvm.py -> build/lib.macosx-10.8-intel-2.7/pystruct/learners

copying pystruct/learners/structured_perceptron.py -> build/lib.macosx-10.8-intel-2.7/pystruct/learners

copying pystruct/learners/subgradient_latent_ssvm.py -> build/lib.macosx-10.8-intel-2.7/pystruct/learners

copying pystruct/learners/subgradient_ssvm.py -> build/lib.macosx-10.8-intel-2.7/pystruct/learners

copying pystruct/learners/svm.py -> build/lib.macosx-10.8-intel-2.7/pystruct/learners

creating build/lib.macosx-10.8-intel-2.7/pystruct/inference

copying pystruct/inference/__init__.py -> build/lib.macosx-10.8-intel-2.7/pystruct/inference

copying pystruct/inference/inference_methods.py -> build/lib.macosx-10.8-intel-2.7/pystruct/inference

copying pystruct/inference/linear_programming.py -> build/lib.macosx-10.8-intel-2.7/pystruct/inference

creating build/lib.macosx-10.8-intel-2.7/pystruct/models

copying pystruct/models/__init__.py -> build/lib.macosx-10.8-intel-2.7/pystruct/models

copying pystruct/models/base.py -> build/lib.macosx-10.8-intel-2.7/pystruct/models

copying pystruct/models/chain_crf.py -> build/lib.macosx-10.8-intel-2.7/pystruct/models

copying pystruct/models/crf.py -> build/lib.macosx-10.8-intel-2.7/pystruct/models

copying pystruct/models/edge_feature_graph_crf.py -> build/lib.macosx-10.8-intel-2.7/pystruct/models

copying pystruct/models/graph_crf.py -> build/lib.macosx-10.8-intel-2.7/pystruct/models

copying pystruct/models/grid_crf.py -> build/lib.macosx-10.8-intel-2.7/pystruct/models

copying pystruct/models/latent_graph_crf.py -> build/lib.macosx-10.8-intel-2.7/pystruct/models

copying pystruct/models/latent_grid_crf.py -> build/lib.macosx-10.8-intel-2.7/pystruct/models

copying pystruct/models/latent_node_crf.py -> build/lib.macosx-10.8-intel-2.7/pystruct/models

copying pystruct/models/multilabel_svm.py -> build/lib.macosx-10.8-intel-2.7/pystruct/models

copying pystruct/models/setup.py -> build/lib.macosx-10.8-intel-2.7/pystruct/models

copying pystruct/models/unstructured_svm.py -> build/lib.macosx-10.8-intel-2.7/pystruct/models

creating build/lib.macosx-10.8-intel-2.7/pystruct/utils

copying pystruct/utils/__init__.py -> build/lib.macosx-10.8-intel-2.7/pystruct/utils

copying pystruct/utils/backports.py -> build/lib.macosx-10.8-intel-2.7/pystruct/utils

copying pystruct/utils/graph.py -> build/lib.macosx-10.8-intel-2.7/pystruct/utils

copying pystruct/utils/inference.py -> build/lib.macosx-10.8-intel-2.7/pystruct/utils

copying pystruct/utils/logging.py -> build/lib.macosx-10.8-intel-2.7/pystruct/utils

copying pystruct/utils/plotting.py -> build/lib.macosx-10.8-intel-2.7/pystruct/utils

creating build/lib.macosx-10.8-intel-2.7/pystruct/datasets

copying pystruct/datasets/__init__.py -> build/lib.macosx-10.8-intel-2.7/pystruct/datasets

copying pystruct/datasets/letters.py -> build/lib.macosx-10.8-intel-2.7/pystruct/datasets

copying pystruct/datasets/scene.py -> build/lib.macosx-10.8-intel-2.7/pystruct/datasets

copying pystruct/datasets/synthetic_grids.py -> build/lib.macosx-10.8-intel-2.7/pystruct/datasets

creating build/lib.macosx-10.8-intel-2.7/pystruct/tests

copying pystruct/tests/__init__.py -> build/lib.macosx-10.8-intel-2.7/pystruct/tests

copying pystruct/tests/test_libraries.py -> build/lib.macosx-10.8-intel-2.7/pystruct/tests

creating build/lib.macosx-10.8-intel-2.7/pystruct/tests/test_learners

copying pystruct/tests/test_learners/__init__.py -> build/lib.macosx-10.8-intel-2.7/pystruct/tests/test_learners

copying pystruct/tests/test_learners/test_binary_svm.py -> build/lib.macosx-10.8-intel-2.7/pystruct/tests/test_learners

copying pystruct/tests/test_learners/test_crammer_singer_svm.py -> build/lib.macosx-10.8-intel-2.7/pystruct/tests/test_learners

copying pystruct/tests/test_learners/test_edge_feature_graph_learning.py -> build/lib.macosx-10.8-intel-2.7/pystruct/tests/test_learners

copying pystruct/tests/test_learners/test_graph_svm.py -> build/lib.macosx-10.8-intel-2.7/pystruct/tests/test_learners

copying pystruct/tests/test_learners/test_latent_node_crf_learning.py -> build/lib.macosx-10.8-intel-2.7/pystruct/tests/test_learners

copying pystruct/tests/test_learners/test_latent_svm.py -> build/lib.macosx-10.8-intel-2.7/pystruct/tests/test_learners

copying pystruct/tests/test_learners/test_n_slack_ssvm.py -> build/lib.macosx-10.8-intel-2.7/pystruct/tests/test_learners

copying pystruct/tests/test_learners/test_one_slack_ssvm.py -> build/lib.macosx-10.8-intel-2.7/pystruct/tests/test_learners

copying pystruct/tests/test_learners/test_perceptron.py -> build/lib.macosx-10.8-intel-2.7/pystruct/tests/test_learners

copying pystruct/tests/test_learners/test_primal_dual.py -> build/lib.macosx-10.8-intel-2.7/pystruct/tests/test_learners

copying pystruct/tests/test_learners/test_structured_perceptron.py -> build/lib.macosx-10.8-intel-2.7/pystruct/tests/test_learners

copying pystruct/tests/test_learners/test_subgradient_latent_svm.py -> build/lib.macosx-10.8-intel-2.7/pystruct/tests/test_learners

copying pystruct/tests/test_learners/test_subgradient_svm.py -> build/lib.macosx-10.8-intel-2.7/pystruct/tests/test_learners

creating build/lib.macosx-10.8-intel-2.7/pystruct/tests/test_models

copying pystruct/tests/test_models/__init__.py -> build/lib.macosx-10.8-intel-2.7/pystruct/tests/test_models

copying pystruct/tests/test_models/test_chain_crf.py -> build/lib.macosx-10.8-intel-2.7/pystruct/tests/test_models

copying pystruct/tests/test_models/test_directional_crf.py -> build/lib.macosx-10.8-intel-2.7/pystruct/tests/test_models

copying pystruct/tests/test_models/test_edge_feature_graph_crf.py -> build/lib.macosx-10.8-intel-2.7/pystruct/tests/test_models

copying pystruct/tests/test_models/test_graph_crf.py -> build/lib.macosx-10.8-intel-2.7/pystruct/tests/test_models

copying pystruct/tests/test_models/test_grid_crf.py -> build/lib.macosx-10.8-intel-2.7/pystruct/tests/test_models

copying pystruct/tests/test_models/test_latent_crf.py -> build/lib.macosx-10.8-intel-2.7/pystruct/tests/test_models

copying pystruct/tests/test_models/test_latent_node_crf.py -> build/lib.macosx-10.8-intel-2.7/pystruct/tests/test_models

copying pystruct/tests/test_models/test_multilabel_problem.py -> build/lib.macosx-10.8-intel-2.7/pystruct/tests/test_models

creating build/lib.macosx-10.8-intel-2.7/pystruct/tests/test_inference

copying pystruct/tests/test_inference/__init__.py -> build/lib.macosx-10.8-intel-2.7/pystruct/tests/test_inference

copying pystruct/tests/test_inference/test_exact_inference.py -> build/lib.macosx-10.8-intel-2.7/pystruct/tests/test_inference

creating build/lib.macosx-10.8-intel-2.7/pystruct/tests/test_utils

copying pystruct/tests/test_utils/__init__.py -> build/lib.macosx-10.8-intel-2.7/pystruct/tests/test_utils

copying pystruct/tests/test_utils/test_utils_inference.py -> build/lib.macosx-10.8-intel-2.7/pystruct/tests/test_utils

running egg_info

writing requirements to pystruct.egg-info/requires.txt

writing pystruct.egg-info/PKG-INFO

writing top-level names to pystruct.egg-info/top_level.txt

writing dependency_links to pystruct.egg-info/dependency_links.txt

warning: manifest_maker: standard file '-c' not found



reading manifest file 'pystruct.egg-info/SOURCES.txt'

reading manifest template 'MANIFEST.in'

warning: no previously-included files matching '*.pyc' found under directory 'doc'

warning: no previously-included files matching '*.pyo' found under directory 'doc'

warning: no previously-included files matching '*.pyc' found under directory 'tests'

warning: no previously-included files matching '*.pyo' found under directory 'tests'

no previously-included directories found matching 'docs/_build'

no previously-included directories found matching 'docs/auto_examples'

no previously-included directories found matching 'docs/generated'

writing manifest file 'pystruct.egg-info/SOURCES.txt'

copying pystruct/datasets/letters.pickle -> build/lib.macosx-10.8-intel-2.7/pystruct/datasets

copying pystruct/datasets/scene.pickle -> build/lib.macosx-10.8-intel-2.7/pystruct/datasets

running build_ext

building 'pystruct.models.utils' extension

creating build/temp.macosx-10.8-intel-2.7

creating build/temp.macosx-10.8-intel-2.7/src

clang -fno-strict-aliasing -fno-common -dynamic -g -Os -pipe -fno-common -fno-strict-aliasing -fwrapv -mno-fused-madd -DENABLE_DTRACE -DMACOSX -DNDEBUG -Wall -Wstrict-prototypes -Wshorten-64-to-32 -DNDEBUG -g -Os -Wall -Wstrict-prototypes -DENABLE_DTRACE -arch i386 -arch x86_64 -pipe -I/System/Library/Frameworks/Python.framework/Versions/2.7/include/python2.7 -c src/utils.c -o build/temp.macosx-10.8-intel-2.7/src/utils.o

clang: error: no such file or directory: 'src/utils.c'

clang: error: no input files

error: command 'clang' failed with exit status 1

----------------------------------------
Cleaning up...
Command /usr/bin/python -c "import setuptools;__file__='/private/tmp/pip_build_root/pystruct/setup.py';exec(compile(open(__file__).read().replace('\r\n', '\n'), __file__, 'exec'))" install --record /tmp/pip-8TrbT9-record/install-record.txt --single-version-externally-managed failed with error code 1 in /private/tmp/pip_build_root/pystruct
Storing complete log in /Users/max/Library/Logs/pip.log

Any help would be greatly appreciated. Thank you!

Citation guidance

It would be good to add a note in the README about how we want this project to be cited in academic work.

Gradient computation in SubgradientStructuredSVM

Hi Andreas,

Hope you don’t mind if I ask some questions here. :) These two are probably bugs, or the result of my misunderstanding.

When you compute the gradient in SubgradientStructuredSVM::_solve_subgradient, instead of
```
grad = (psi_matrix - w / self.C / 2.) 
```
should be
```
grad = (psi_matrix - w / self.C) 
```
Otherwise you optimize the objective with a different value of C.
In the same function, when you compute the non-adagrad update, you store it in grad_old instead of self.grad_old, thus ignoring momentum.
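
The first point can be checked numerically. The sketch below assumes the regularizer in the objective has the form ||w||^2 / (2C); that form is an assumption made for illustration, not something stated in this thread. Under it, the regularizer's gradient is w / C, while w / C / 2 would be the gradient of ||w||^2 / (4C), i.e. the same objective with C doubled.

```python
def regularizer(w, C):
    """Quadratic regularizer ||w||^2 / (2 C) (assumed form)."""
    return sum(w_i * w_i for w_i in w) / (2.0 * C)


def numerical_gradient(func, w, eps=1e-6):
    """Central finite-difference gradient of func at w."""
    grad = []
    for i in range(len(w)):
        w_plus = list(w)
        w_minus = list(w)
        w_plus[i] += eps
        w_minus[i] -= eps
        grad.append((func(w_plus) - func(w_minus)) / (2.0 * eps))
    return grad


C = 0.5
w = [1.0, -2.0, 3.0]
numeric = numerical_gradient(lambda v: regularizer(v, C), w)
analytic = [w_i / C for w_i in w]        # proposed term: w / C
halved = [w_i / C / 2.0 for w_i in w]    # reported term: w / C / 2
```

Here numeric agrees with analytic and differs from halved by a factor of two, which matches the claim that the extra division changes the effective value of C.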

redo constraint pruning in one_slack

Rewrite constraint pruning in one_slack to be similar to n_slack, i.e. less convoluted.

Learning 2D interactions

I am trying to reproduce the sample code in plot_grid_crf.py with my custom data. My array X has shape (2, 720, 960, 12): two training images of size 720 (rows) x 960 (columns), with class-wise probabilities along the final axis (axis 3). Similarly, my array Y has shape (2, 720, 960) and holds the pixel-wise ground truth.

When I define the clf and crf objects the same way as the sample code, I get the following error :

Traceback (most recent call last):

  File "<ipython-input-5-3c2334b8ac36>", line 1, in <module>
    runfile('/home/prassanna/Development/workspace/Semantic-texton-forests/scripts/tempcrf.py', wdir='/home/prassanna/Development/workspace/Semantic-texton-forests/scripts')

  File "/usr/local/lib/python2.7/dist-packages/spyderlib/widgets/externalshell/sitecustomize.py", line 601, in runfile
    execfile(filename, namespace)

  File "/usr/local/lib/python2.7/dist-packages/spyderlib/widgets/externalshell/sitecustomize.py", line 73, in execfile
    builtins.execfile(filename, *where)

  File "/home/prassanna/Development/workspace/Semantic-texton-forests/scripts/tempcrf.py", line 158, in <module>
    clf.fit(X, Y)

  File "/usr/local/lib/python2.7/dist-packages/pystruct/learners/one_slack_ssvm.py", line 440, in fit
    joint_feature_gt = self.model.batch_joint_feature(X, Y)

  File "/usr/local/lib/python2.7/dist-packages/pystruct/models/base.py", line 40, in batch_joint_feature
    joint_feature_ += self.joint_feature(x, y)

  File "/usr/local/lib/python2.7/dist-packages/pystruct/models/graph_crf.py", line 194, in joint_feature
    unary_marginals[gx, y] = 1

IndexError: index 3 is out of bounds for axis 1 with size 3

I tried the same with -1 * log probabilities instead of probabilities and hit the same issue again. However, it works great with your sample data. I do not understand what's causing the problem.

Any help would be appreciated...

-Semicolon Warrior
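
The IndexError reports that label value 3 was used to index an axis of size 3, so at least one ground-truth label is as large as the number of states the model is working with. One possible cause, stated as an assumption since the thread does not confirm it, is that the labels in Y are not contiguous integers 0..K-1, so counting the distinct labels gives a smaller number than the largest label value. A minimal sketch of remapping labels to a contiguous range (the helper remap_labels is hypothetical and uses nested lists to stay dependency-free):

```python
def remap_labels(Y):
    """Map arbitrary integer labels to contiguous ids 0..K-1.

    Y is a list of label grids (each a list of rows). Returns the
    remapped grids and the mapping original_label -> new_id.
    """
    unique = sorted({label for grid in Y for row in grid for label in row})
    mapping = {label: idx for idx, label in enumerate(unique)}
    remapped = [[[mapping[label] for label in row] for row in grid]
                for grid in Y]
    return remapped, mapping


# Two tiny 2x2 "images" that only use the labels 0, 3 and 7.
Y = [[[0, 3], [7, 3]], [[7, 7], [0, 0]]]
Y_contiguous, mapping = remap_labels(Y)
```

After remapping, the largest label equals the number of distinct labels minus one, so indexing a per-state array of that size stays in bounds.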

Add test for DirectionalGridCRF

Because that doesn't seem to be tested at all.

SubgradientSSVM trained with n_jobs > 1 performing worse than SubgradientSSVM trained with n_jobs==1 after same number of iterations

I was writing some tests for SaveLogger functionality with n_jobs > 1, because currently my pool attribute causes this to fail, and came across what I think is a bug.
There are significant differences in performance for SubgradientSSVM between training with n_jobs==1 and n_jobs > 1 in the current code using Parallel. I haven't tested whether the Pool implementation shows this behavior. It seems worse with more cores:

on my local machine, 4 cores:

local$ python subgradient_ssvm_bug.py
0.973684210526
0.921052631579

on aws server, 16 cores:

aws-server$ python subgradient_ssvm_bug.py
0.973684210526
0.710526315789

import numpy as np
from sklearn.datasets import load_iris
# train_test_split moved from sklearn.cross_validation to
# sklearn.model_selection in newer scikit-learn releases.
from sklearn.model_selection import train_test_split
from pystruct.models import GraphCRF
from pystruct.learners import SubgradientSSVM

if __name__ == '__main__':
    iris = load_iris()
    X, y = iris.data, iris.target

    # Each sample is a single-node graph with no edges.
    X_ = [(np.atleast_2d(x), np.empty((0, 2), dtype=int)) for x in X]
    Y = y.reshape(-1, 1)

    X_train, X_test, y_train, y_test = train_test_split(X_, Y, random_state=1)

    pbl = GraphCRF(n_features=4, n_states=3, inference_method='unary')

    # Sequential training.
    svm = SubgradientSSVM(pbl, max_iter=100)
    svm.fit(X_train, y_train)
    print(svm.score(X_test, y_test))

    # Parallel training on all available cores.
    svm_par = SubgradientSSVM(pbl, max_iter=100, n_jobs=-1)
    svm_par.fit(X_train, y_train)
    print(svm_par.score(X_test, y_test))

Add Latent SVM

We should have a separate Latent SVM model as in the digits example.
Basically it would only need slight modifications from the CrammerSinger model, with additions from LatentGraphCRF. That would make the example much faster and can be used in many cases.

cvxopt error with LatentGridCRF

With cvxopt 1.1.7 on a win64 machine I get an error while trying out the example (https://pystruct.github.io/auto_examples/plot_latent_crf.html#plot-latent-crf-py).

It says TypeError: Non-numeric type in list in \pystruct\inference\linear_programming.pyc in line 82 at A = cvxopt.spmatrix(data, I, J).

cvxopt on its own seems to work fine when initializing sparse matrices, since

from cvxopt import spmatrix
D = spmatrix([1., 2.], [0, 1], [0, 1], (4,2))

works.

speed up tests

I hate slow tests. Currently they take ~15 minutes on travis. WUT?
They should probably also be more systematic. I'm not sure why I test some models for lp and ad3. Not sure there is any gain there (though lp always returns "marginals", which ad3 doesn't do any more).

Compare constraint caching with SVM^struct

When doing benchmarks, I found out that SVM^struct's one-slack solver can benefit from constraint caching for multi-class SVMs. I don't understand that, as I would have thought finding the most violated constraint would be less expensive than evaluating all cached inference results.
Needs investigation.

fix travis

Something with the multiprocessing seems to act up:
https://travis-ci.org/pystruct/pystruct

Make examples run without qpbo

There should be a safeguard against qpbo not being installed in the examples.
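
A minimal sketch of such a safeguard: probe for optional modules and fall back when none is importable. The helper first_available is hypothetical; the module names pyqpbo and ad3 are taken from the pip log earlier on this page, and it is an assumption that the import names match the package names.

```python
import importlib.util


def first_available(candidates, default=None):
    """Return the first top-level module name that can be imported."""
    for name in candidates:
        # find_spec returns None for a missing top-level module
        # instead of raising ImportError.
        if importlib.util.find_spec(name) is not None:
            return name
    return default


# Prefer qpbo, then ad3; None means neither optional solver is installed.
backend = first_available(["pyqpbo", "ad3"])

# Deterministic demo using a standard-library module as the fallback.
demo = first_available(["module_that_does_not_exist_xyz", "json"])
missing = first_available(["module_that_does_not_exist_xyz"], default="fallback")
```

An example script could use the returned name to pick an inference method, or print a short message and exit cleanly when the result is None.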

Make the LP solver use cvxopt directly instead of glpk

So we can eliminate the glpk dependency.
Needs some processing of the LP matrix, I think.

Helper functions for pretty returning of parameters

It would be nice if the user could get potentials returned as separate matrices of potential types.

I.e., for graph_crf, instead of

[1 2 -1 -2, 4 5 6]

we get

[numpy.array([[ 1,  2],
              [-1, -2]]),
 numpy.array([[4, 5],
              [5, 6]])
]
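
A minimal sketch of such a helper, using plain lists to stay dependency-free. The weight layout is an assumption inferred from the example above: unary weights first in row-major order, then the upper triangle (including the diagonal) of the symmetric pairwise matrix. The function name unpack_weights is hypothetical.

```python
def unpack_weights(w, n_states, n_features):
    """Split a flat weight vector into unary and pairwise matrices.

    Assumed layout: n_states * n_features unary weights (row-major),
    followed by the upper triangle of a symmetric n_states x n_states
    pairwise matrix, row by row.
    """
    n_unary = n_states * n_features
    n_pairwise = n_states * (n_states + 1) // 2
    if len(w) != n_unary + n_pairwise:
        raise ValueError("expected %d weights, got %d"
                         % (n_unary + n_pairwise, len(w)))
    unary = [list(w[s * n_features:(s + 1) * n_features])
             for s in range(n_states)]
    pairwise = [[0] * n_states for _ in range(n_states)]
    upper = iter(w[n_unary:])
    for i in range(n_states):
        for j in range(i, n_states):
            # Mirror each upper-triangle entry into the lower triangle.
            pairwise[i][j] = pairwise[j][i] = next(upper)
    return unary, pairwise


unary, pairwise = unpack_weights([1, 2, -1, -2, 4, 5, 6],
                                 n_states=2, n_features=2)
```

For the flat vector from the example this yields the two matrices shown above.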

Block-coordinate Frank-Wolfe algorithm

There's a paper at ICML 2013 "Block-Coordinate Frank-Wolfe Optimization for Structural SVMs".

The block-coordinate variant should be very easy to implement for you since it is basically equivalent to the stochastic subgradient based solver except that the step size is tuned in closed form. The paper uses a formulation based on lambda rather than C but I think one only needs to replace lambda everywhere by 1 and to clip gamma to [0, C] instead of [0, 1] (c.f. Algorithm 4).

Also this algorithm can be seen as a batch version of structured passive-aggressive. @vene told me he was thinking of implementing passive-aggressive but I think this block-coordinate algorithm should be better, since it can use the knowledge of previous iterations, unlike passive-aggressive.
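
A minimal sketch of the closed-form step size and block update, written in the lambda-based formulation with gamma clipped to [0, 1]. The formula is reproduced from memory of Algorithm 4 in the cited paper and should be checked against it; all function and variable names are illustrative, and the C-based variant suggested above is not implemented here.

```python
def dot(a, b):
    """Inner product of two equal-length sequences."""
    return sum(x * y for x, y in zip(a, b))


def bcfw_step_size(w, w_i, w_s, l_i, l_s, lam):
    """Closed-form line-search step for one block, clipped to [0, 1]."""
    diff = [a - b for a, b in zip(w_i, w_s)]
    denom = lam * dot(diff, diff)
    if denom == 0.0:
        # The block is already at the corner found by inference.
        return 0.0
    gamma = (lam * dot(diff, w) - l_i + l_s) / denom
    return min(1.0, max(0.0, gamma))


def bcfw_block_update(w, w_i, l_i, w_s, l_s, gamma):
    """Move block i towards (w_s, l_s) and update the global w."""
    new_w_i = [(1.0 - gamma) * a + gamma * b for a, b in zip(w_i, w_s)]
    new_l_i = (1.0 - gamma) * l_i + gamma * l_s
    new_w = [x + n - o for x, n, o in zip(w, new_w_i, w_i)]
    return new_w, new_w_i, new_l_i


w, w_i, w_s = [2.0, 0.0], [1.0, 0.0], [0.0, 0.0]
gamma = bcfw_step_size(w, w_i, w_s, l_i=1.5, l_s=0.0, lam=1.0)
new_w, new_w_i, new_l_i = bcfw_block_update(w, w_i, 1.5, w_s, 0.0, gamma)
clipped = bcfw_step_size([1.0, 0.0], [1.0, 0.0], [0.0, 0.0], 0.0, 0.5, 1.0)
```

The only difference from a stochastic subgradient step is that gamma is computed per block instead of following a fixed schedule.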

Stopping tolerance scales with C

As the objective scales with C, the stopping tolerance of some estimators does, too.
Maybe the way the objective is scaled is not that great an idea after all?

negative_constraint broken

The negativity constraint was broken in a refactoring and needs attention :-/

fix initialization of EdgeFeatureGraphCRF

The init assumes that n_edge_features is set. The check for symmetric and antisymmetric edge features should go into initialize.

Encode symmetric matrices using scipy.spatial

Symmetric weight-matrices are currently transformed to flat vectors using boolean masks. I would rather use the scipy methods. I'll try to do that today. It will change the memory layout of the (flat) weights, though.

Remove EdgeType CRF

As this is more readily implemented using EdgeFeatureGraphCRF.
DirectionalGridCRF needs to be rewritten first, though :-/

UnboundLocalError when fitting example with only one label

When fitting a data set with only one label (all entries in y are equal), the following error is raised:

UnboundLocalError: local variable 'objective' referenced before assignment

backtrace:

Training 1-slack dual structural SVM
iteration 0
no additional constraints
---------------------------------------------------------------------------
UnboundLocalError                         Traceback (most recent call last)
[..]

/usr/local/lib/python2.7/dist-packages/pystruct/learners/one_slack_ssvm.pyc in fit(self, X, Y, constraints, warm_start, initialize)
    520         primal_objective = self._objective(X, Y)
    521         self.primal_objective_curve_.append(primal_objective)
--> 522         self.objective_curve_.append(objective)
    523         self.cached_constraint_.append(False)
    524

It is reasonable that the model cannot be fit to this kind of labeled data, but the situation should be caught and a more meaningful error raised.
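
A minimal sketch of such a check, run before fitting. The function name check_label_variety and the error wording are hypothetical, not existing pystruct behavior.

```python
def check_label_variety(Y):
    """Raise a descriptive error if Y contains fewer than two labels.

    Y is an iterable of label sequences, one per training example.
    Returns the sorted distinct labels when the check passes.
    """
    labels = set()
    for y in Y:
        labels.update(y)
    if len(labels) < 2:
        raise ValueError(
            "Training data contains only %d distinct label(s); "
            "at least two are needed to fit the model." % len(labels))
    return sorted(labels)


try:
    check_label_variety([[1, 1, 1], [1, 1]])
    message = None
except ValueError as exc:
    message = str(exc)

found = check_label_variety([[0, 1], [1]])
```

Calling a check like this at the start of fit would replace the UnboundLocalError with a message that names the actual problem.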

Remaining issues in Frank-Wolfe implementation

There are still some minor issues that need to be fixed in the FrankWolfeSSVM:

don't recompute inference for duality gap in batch case
increase test coverage
benchmark
rename estimator to BCFW (should we? that is the name used in the paper)
implement averaging for batch version
Allow mini-batches for parallel inference on multi-core computers.
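
For the last item, a minimal sketch of splitting the sample indices into mini-batches, each of which could then be handed to a separate worker for inference. The helper minibatch_slices is hypothetical and the dispatch to workers is left out.

```python
def minibatch_slices(n_samples, batch_size):
    """Yield slices that cover range(n_samples) in consecutive batches."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, n_samples, batch_size):
        yield slice(start, min(start + batch_size, n_samples))


samples = list(range(10))
batches = [samples[s] for s in minibatch_slices(len(samples), 4)]
```

The last batch is simply shorter when the number of samples is not a multiple of the batch size.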

actually implement options that MultiLabelClf promises

The docstring says that it will construct a full graph or a chow-liu tree if you tell it to... but that is not implemented :-/ it is basically some refactoring from the example.

Stack result of predict if possible

I think it would be nice to ensure the result of predict is an array if this is possible for the model. This would make integration of the multi-class and multi-label algorithms into scikit-learn smoother.
Maybe we could just hack it in by seeing if it is possible and leaving it if not.

website: fix links to classes in examples

There is this fancy script that creates links for all classes and functions in the examples to the documentation.
In scikit-learn that works for classes, in pystruct it doesn't :-/

unstructured svm crashes with n_jobs > 1

I think. If it does, it should at least throw a nice error.

Make tests robust to installed inference procedures

The test should check which algorithms are installed and only test those.

implement warm-start for inference

There should be a method to warm-start inference procedures from past iterations.
We need to cache the last result for each example and feed it to the inference procedure.
This is basically independent of the learner.
Maybe it could be completely put into the model.

First, we need to implement it in the inference procedures, though.
The LP should be able to benefit, and also opengm.
A slight API crux is that the result alone is not enough to warm-start, we also need dual solutions, messages, etc. depending on the method.
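
One way to picture the caching part, independent of any learner: a small per-example cache that stores the last labeling together with an opaque solver state (dual solutions, messages, and so on). Everything here is hypothetical, including the convention that an inference function accepts the previous cache entry and returns a (labeling, state) pair.

```python
class WarmStartCache(object):
    """Remember the last inference result and solver state per example."""

    def __init__(self):
        self._store = {}

    def get(self, example_id):
        # Returns None when the example has not been seen yet.
        return self._store.get(example_id)

    def update(self, example_id, labeling, solver_state=None):
        self._store[example_id] = {"labeling": labeling,
                                   "solver_state": solver_state}


def cached_inference(infer, cache, example_id, x):
    """Run infer(x, previous) and store its result for the next call."""
    previous = cache.get(example_id)
    labeling, state = infer(x, previous)
    cache.update(example_id, labeling, state)
    return labeling


def toy_infer(x, previous):
    # Toy stand-in: pretend a warm start needs fewer iterations.
    iterations = 10 if previous is None else 2
    return [v > 0 for v in x], {"iterations": iterations}


cache = WarmStartCache()
first = cached_inference(toy_infer, cache, 0, [1, -1])
cold_iterations = cache.get(0)["solver_state"]["iterations"]
second = cached_inference(toy_infer, cache, 0, [1, -1])
warm_iterations = cache.get(0)["solver_state"]["iterations"]
```

Keeping the solver state opaque lets each inference method decide what it needs to store, which sidesteps the API question of what exactly a warm start consists of.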

Nodes would be better called examples

In http://pystruct.github.io/generated/pystruct.models.GraphCRF.html#pystruct.models.GraphCRF

Node features are given as a tuple of shape (n_nodes, n_features), An instance x is represented as a tuple (features, edges) where edges is an array of shape (n_edges, 2), representing the graph.

Might be better as something like

Examples, i.e. X, are given as a tuple of length n_examples. An example, x, is represented as a tuple (features, edges) where features is a numpy array of shape (n_nodes, attributes), and edges is an array of shape (n_edges, 2), representing the graph.

Labels, Y, are given as tuple of length n_examples. Each label, y, in Y is given by a numpy array of shape (n_nodes,).

[bug] image_segmentation example

I've downloaded pickle files and I've tried to run this example, but I've got an error:

Traceback (most recent call last):
  File "image_segmentation.py", line 30, in <module>
    data_train = cPickle.load(open("data_train.pickle"))
ImportError: No module named latent_crf_experiments.utils

I haven't found such a name in pystruct sources.
pystruct was installed by command python setup.py install --user
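
The ImportError suggests that the pickle file references a class defined in a module named latent_crf_experiments.utils, and pickle needs that module to be importable in order to rebuild the object. The cleanest fix is to make that module available. As a possible workaround, sketched here under the assumption that only the stored attributes are needed and not the class's methods, an Unpickler subclass can substitute a placeholder class for anything it cannot import. The names Placeholder, TolerantUnpickler and the demo module are hypothetical, and as with any pickle this should only be used on trusted files.

```python
import io
import pickle
import sys
import types


class Placeholder(object):
    """Stand-in for any class whose defining module cannot be imported."""


class TolerantUnpickler(pickle.Unpickler):
    """Unpickler that substitutes Placeholder for unresolvable classes."""

    def find_class(self, module, name):
        try:
            return super().find_class(module, name)
        except (ImportError, AttributeError):
            return Placeholder


def tolerant_loads(data):
    return TolerantUnpickler(io.BytesIO(data)).load()


# Demo: pickle an object whose class lives in a throwaway module,
# then make that module unavailable before loading.
demo_module = types.ModuleType("demo_missing_module")


class Bunch(object):
    pass


Bunch.__module__ = "demo_missing_module"
Bunch.__qualname__ = "Bunch"
demo_module.Bunch = Bunch
sys.modules["demo_missing_module"] = demo_module

original = Bunch()
original.X = [1, 2, 3]
payload = pickle.dumps(original)

del sys.modules["demo_missing_module"]
restored = tolerant_loads(payload)
```

The restored object carries the original attributes but is an instance of Placeholder, so any code that relies on methods of the original class will still fail.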

Organize tests according to modules

There should be a test folder for the models and one for the learners.

Subgradient: batch_size=-1 yields results different from batch_size=len(X)

If I understand the documentation and the code correctly, setting the batch_size argument in the Subgradient SSVM to -1 or to the size of the training set should give the same results. This is not the case, due to line https://github.com/pystruct/pystruct/blob/master/pystruct/learners/subgradient_ssvm.py#L293
(If removing the "None", it works as expected.)