
Comments (11)

mwhudson commented on September 23, 2024

(by mwhudson)
This is already possible -- look at pydoctor.twistedmodel.TwistedSystem.privacyClass -- just not easy. A command line option would be nice.

from pydoctor.

mwhudson commented on September 23, 2024

(by magmatt)
This patch adds an --exclude option that accepts regular expressions. What think ye?

mwhudson commented on September 23, 2024

(by duncf)
I'm not able to apply that cleanly to trunk, and it looks like that code has changed a bit. :-(

tristanlatr commented on September 23, 2024

> an --exclude option that accepts regular expressions

This is a great idea

adiroiban commented on September 23, 2024

For reference, here is the content of the patch as found on Launchpad: https://launchpadlibrarian.net/72928008/exclude.patch

diff --git a/pydoctor/driver.py b/pydoctor/driver.py
index c520d07..62ec1cd 100644
--- a/pydoctor/driver.py
+++ b/pydoctor/driver.py
@@ -1,7 +1,7 @@
 """The command-line parsing and entry point."""
 
 from pydoctor import model, zopeinterface
-import sys, os
+import sys, os, re
 
 def error(msg, *args):
     if args:
@@ -98,6 +98,11 @@ def getparser():
         metavar='MODULE', default=[],
         help=("Add a module to the system.  Can be repeated."))
     parser.add_option(
+        '--exclude', action='append', dest='exclude',
+        metavar='EXCLUDEDPATTERN', default=[],
+        help=("Exclude packages and modules matching this regex.  "
+              "Can be repeated to exclude multiple packages/modules"))
+    parser.add_option(
         '--prepend-package', action='store', dest='prependedpackage',
         help=("Pretend that all packages are within this one.  "
               "Can be used to document part of a package."))
@@ -295,6 +300,14 @@ def main(args):
             else:
                 options.makehtml = False
 
+        # step 1.9: make regex excludes
+        if options.exclude:
+            try:
+                options.exclude = map(re.compile, options.exclude)
+            except re.error, e:
+                error("Exclude options must be valid regular "
+                      "expressions.  Error: %s" % e)
+
         # step 2: add any packages and modules
 
         if args:
diff --git a/pydoctor/model.py b/pydoctor/model.py
index caa15c3..1a55828 100644
--- a/pydoctor/model.py
+++ b/pydoctor/model.py
@@ -546,6 +546,10 @@ class System(object):
         if not os.path.exists(os.path.join(dirpath, '__init__.py')):
             raise Exception("you must pass a package directory to "
                             "addPackage")
+ 
+        if self.shouldExclude(dirpath):
+            return
+        
         package = self.Package(self, os.path.basename(dirpath),
                                None, parentPackage)
         self.addObject(package)
@@ -558,8 +562,18 @@ class System(object):
                 if os.path.exists(initname):
                     self.addPackage(fullname, package)
             elif fname.endswith('.py') and not fname.startswith('.'):
+                if self.shouldExclude(fullname):
+                    continue
                 self.addModule(fullname, package)
 
+    def shouldExclude(self, name):
+        '''Return True if this name is excluded'''
+        basename = os.path.basename(name)
+        for exclude in self.options.exclude:
+            if exclude.match(basename):
+                return True
+        return False
+
     def handleDuplicate(self, obj):
         '''This is called when we see two objects with the same
         .fullName(), for example:
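
The patch above is written for Python 2 (`except re.error, e`, and `map` returning a list). As a minimal sketch of the same logic in Python 3, assuming hypothetical helper names `compile_excludes` and `should_exclude` that are not part of pydoctor:

```python
import os
import re


def compile_excludes(patterns):
    """Compile each pattern, raising ValueError on an invalid regex.

    Hypothetical helper; stands in for the "step 1.9" block of the patch.
    """
    compiled = []
    for pattern in patterns:
        try:
            compiled.append(re.compile(pattern))
        except re.error as e:
            raise ValueError(
                "Exclude options must be valid regular expressions. "
                f"Error: {e}"
            ) from e
    return compiled


def should_exclude(path, excludes):
    """Return True if the basename of path matches any compiled pattern.

    Like the patch, this uses re.match, so patterns are anchored at the
    start of the basename but not at its end.
    """
    basename = os.path.basename(path)
    return any(rx.match(basename) for rx in excludes)
```

Because only the basename is tested, a pattern such as `tests?$` excludes every directory named `test` or `tests`, wherever it sits in the tree.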

tristanlatr commented on September 23, 2024

So I have something working, but I have some questions.

For now it works with the exact fullName of an object; it does not accept a regex.
I did not use a regex for the initial implementation because passing a module name would then mark all of its children as private too, which we don't always want.

So should I add --private-re and --exclude-re in addition?

mthuurne commented on September 23, 2024

This is how Twisted currently implements it:

    def privacyClass(self, documentable):
        """
        Report the privacy level for an object.

        Hide all tests with the exception of L{twisted.test.proto_helpers}.

        @param obj: Object for which the privacy is reported.
        @type obj: C{model.Documentable}

        @rtype: C{model.PrivacyClass} member
        """
        if documentable.fullName() == "twisted.test":
            # Match this package exactly, so that proto_helpers
            # below is visible
            return model.PrivacyClass.VISIBLE

        current = documentable
        while current:
            if current.fullName() == "twisted.test.proto_helpers":
                return model.PrivacyClass.VISIBLE
            if isinstance(current, model.Package) and current.name == "test":
                return model.PrivacyClass.HIDDEN
            current = current.parent

        return super().privacyClass(documentable)

So it marks everything inside packages named test as hidden, except for twisted.test and twisted.test.proto_helpers. As discussed in #315, I think it's fine and probably an improvement if we'd exclude test modules instead of marking them as hidden.

I would really like to get Twisted's use of pydoctor to zero code customization, since that would allow us to restructure the code without having to worry about breaking Twisted. So it would be good if an exclude mechanism would support Twisted's requirements.

You could probably handle Twisted's configuration with a regular expression, using negative lookahead assertions. But it would not be very intuitive. In general, I think that having to escape dots in regular expressions if you want to match module paths is not that user friendly. Maybe regular expressions are overkill and we could use fnmatch (glob patterns) instead? We could combine a --exclude option with an --include option, where the include overrides the exclude.
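
To make the comparison concrete, here is a rough sketch of both approaches applied to dotted module names. The pattern lists, the regular expression, and the function names are all assumptions made for illustration; none of this is pydoctor's or Twisted's actual configuration.

```python
import re
from fnmatch import fnmatchcase

# Hypothetical glob patterns approximating Twisted's rule for module names.
EXCLUDE = ["*.test", "*.test.*"]
INCLUDE = ["twisted.test", "twisted.test.proto_helpers"]

# The same rule written as one regex with negative lookahead assertions.
EXCLUDE_RE = re.compile(
    r"(?!twisted\.test$)(?!twisted\.test\.proto_helpers$).*\.test(\..*)?$"
)


def is_excluded(name, exclude=EXCLUDE, include=INCLUDE):
    """Return True if name matches an exclude glob and no include glob."""
    if any(fnmatchcase(name, pattern) for pattern in include):
        return False  # include overrides exclude
    return any(fnmatchcase(name, pattern) for pattern in exclude)


def is_excluded_re(name):
    """Regex-only variant of the same decision."""
    return EXCLUDE_RE.match(name) is not None
```

Note that `*` in fnmatch also matches dots, so `*.test.*` covers test packages at any depth. This sketch only decides on module names; it does not walk up from classes or functions to their containing module the way the `privacyClass` override does.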

Another thing to consider is when it would be better to use a command line option and when it would be better to use markers inside docstrings. We discussed marking constructors as private in #305, but such a marker could also be placed in a module docstring to mark an entire module as private.

I think that a command line option makes sense if you might want to run pydoctor twice, with different options. For example the HTML generation would exclude test packages, while a lint run would include them. For efficiency, excluding can be done on file name, so pydoctor would save time not parsing the source code, while a marker inside a docstring can only be applied after the module has been parsed.
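
The efficiency argument can be illustrated with a small sketch that filters on file and directory names before anything is parsed. `iter_sources` is a hypothetical helper, not pydoctor code, and it assumes glob patterns applied to basenames:

```python
import os
from fnmatch import fnmatchcase


def iter_sources(root, exclude):
    """Yield .py files under root, skipping names that match any exclude glob."""

    def excluded(name):
        return any(fnmatchcase(name, pattern) for pattern in exclude)

    for dirpath, dirnames, filenames in os.walk(root):
        # Prune in place so os.walk never descends into excluded directories.
        dirnames[:] = sorted(d for d in dirnames if not excluded(d))
        for filename in sorted(filenames):
            if filename.endswith(".py") and not excluded(filename):
                yield os.path.join(dirpath, filename)
```

Calling `iter_sources("src", ["test"])` would skip every directory named `test` without opening any file inside it, which is the saving a docstring marker cannot provide.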

Therefore for excluding modules a command line option is better than a marker inside a docstring, but for marking modules as private that might not be the case. Do we have a use case for --private? I'd prefer to only add it if we know for sure that it is going to be used.

tristanlatr commented on September 23, 2024

Thanks for all the feedback.

In my view, there is a bigger use case for --private than for --exclude. Take our tests (pydoctor.test), for instance: it's good for maintainers to have them linked inside our docs, but as private elements.

I would personally rather use --private and --public, so I think there is a use case. Some functions in the standard library start with an underscore but are actually public!

Users might also want to make everything public; fnmatch would offer this option by passing *, I guess, so that's good.

But the thing is, with --private comes --public. I understand the --exclude use case too, and I'll try to make that a "real" exclusion, i.e. completely ignoring the module. But what if a function name is passed? Hiding it is simpler.

So I think another version that adds those 4 new arguments with the fnmatch function would meet the requirements?

mthuurne commented on September 23, 2024

> In my view, there is a bigger use case for --private than for --exclude. Take our tests (pydoctor.test), for instance: it's good for maintainers to have them linked inside our docs, but as private elements.

I originally thought as well that making them private would be better than excluding them, but in practice I never read test docs in HTML: I only read them when working on the tests, in which case I have the tests open in my editor already.

For a project like Twisted, where approximately half of the code is in tests, excluding tests can make documentation generation and publishing significantly faster. Twisted has discarded test docs for the last 5 years and I haven't heard anyone say that they missed them. However, discarding the test docs after parsing them is less efficient than excluding the sources instead.

> I would personally rather use --private and --public, so I think there is a use case. Some functions in the standard library start with an underscore but are actually public!

I think it would make more sense to mark those functions as public using a marker in their docstring rather than doing it from the command line, since the fact they are public should not change from one invocation of pydoctor to another.

> Users might also want to make everything public; fnmatch would offer this option by passing *, I guess, so that's good.
>
> But the thing is, with --private comes --public and with --exclude comes --include. I understand the --exclude use case too, and I'll try to make that a "real" exclusion, i.e. completely ignoring the module. But what if a function name is passed? Hiding it is simpler.

I think it would be sufficient if --exclude and --include would only operate on modules. Perhaps they could even operate on directory trees instead of Python names. When dealing with native modules, file names would allow explicit control over whether the native module should be introspected or the Python implementation should be analyzed.

> So I think another version that adds those 4 new arguments with the fnmatch function would meet the requirements?

I'm still not sure we need --private and --public if we'd have the ability to mark things as private or public in their docstrings.

In general I think it's easier to add features than to remove them, since it's easier to demonstrate that something is needed than that something is redundant. So I'd prefer to do marking of public/private in docstrings first and if we find cases where that is impractical, we can add command line options then.

tristanlatr commented on September 23, 2024

OK, I get your point, but what about people who wish to make everything public?

They would need the --public=* argument.

This is a valid wish; I will use that for projects where the API docs are only for maintainers.

tristanlatr commented on September 23, 2024

Fixed in the latest version with the new --privacy argument; see the docs: https://pydoctor.readthedocs.io/en/latest/customize.html#override-objects-privacy-show-hide
