
doc-pipeline's Issues

Empty output from generation jobs

Seeing some jobs that produce only the following files, which are missing the actual content:

docuploader > Sit tight, I'm tarring up your docs in ..
./
./docs.metadata
./xrefmap.yml
./_toc.yaml

Error preparing Java TOCs

docpipeline > Error processing docfx-java-pubsublite-kafka-0.2.0.tar.gz:

'dict' object has no attribute 'sort'

Reproduce locally:

$ gsutil cp gs://docs-staging-v2/docfx-java-pubsublite-kafka-0.2.0.tar.gz gs://my-bucket
$ SOURCE_BLOB=docfx-java-pubsublite-kafka-0.2.0.tar.gz SOURCE_BUCKET=my-bucket TRAMPOLINE_BUILD_FILE=./generate.sh TRAMPOLINE_IMAGE=gcr.io/cloud-devrel-kokoro-resources/docfx TRAMPOLINE_DOCKERFILE=docfx/Dockerfile ci/trampoline_v2.sh

Here is the .sort call:

# sort list of dict on dict key 'uid' value
toc.sort(key=lambda x: x.get("uid"))
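
A minimal sketch of one possible guard, assuming the failure happens because a single-entry TOC is parsed as a dict rather than a list of dicts (the variable names follow the snippet above; this is not the confirmed fix):

# If the parsed TOC is a single mapping rather than a list, wrap it so the
# existing sort keeps working; default missing uids to "" so None never
# ends up in the comparison.
if isinstance(toc, dict):
    toc = [toc]
toc.sort(key=lambda x: x.get("uid", ""))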

Allow parallel builds to work

The goal is to replicate the behavior of FORCE_GENERATE_ALL but make it run faster across all the languages by running each language's build in parallel. This depends on #40 being finished first.

Automatically file issues for failing tarballs

When the pipeline fails to process a tarball, we should automatically file an issue.

While we iron out the notifications, issues should be filed on this repo.

We can use flakybot to manage the issues.

Thoughts:

  • Does the docs.metadata include the repo?
  • We'd need to generate xUnit XML with a "fake" test case for each tarball (see the sketch after this list).
  • Each tarball can be in a different repo. Do we need a different flakybot invocation for each repo, or can we update flakybot to magically handle that for us?
    • I'm leaning toward separate invocations -- the complexity stems from this repo and has not been needed on flakybot thus far (that I know of).
  • Can we depend on the fact tarballs are lazily generated? If a tarball fails to generate, can we always be sure that when it succeeds in the future, we'll tell flakybot about it?
    • What if we update the template, try to regenerate the HTML, and it fails? The pipeline won't automatically rebuild the tarball since the HTML already exists, so will it never self-heal and close the issue on its own? This could be really annoying for library owners.
  • What if the failure is caused by the pipeline? Should we notify the source repo? The doc-pipeline owners should be cc'd on all issues filed by this bot when the issue is filed on another repo.
    • If more than N tarballs fail to process, we should assume it's doc-pipeline's fault, not the tarball.
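
A rough sketch of the xUnit idea from the list above, writing one "fake" test case per tarball so flakybot can track each one (the function name, result shape, and output file name are assumptions, not existing doc-pipeline code):

import xml.etree.ElementTree as ET

def write_xunit_report(results, path="sponge_log.xml"):
    # results maps tarball name -> error message, or None if it processed fine.
    failures = sum(1 for error in results.values() if error)
    suite = ET.Element(
        "testsuite",
        name="doc-pipeline",
        tests=str(len(results)),
        failures=str(failures),
    )
    for tarball, error in results.items():
        case = ET.SubElement(suite, "testcase", classname="generate", name=tarball)
        if error:
            ET.SubElement(case, "failure", message="generation failed").text = error
    ET.ElementTree(suite).write(path, encoding="utf-8", xml_declaration=True)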

Dependency Dashboard

This issue contains a list of Renovate updates and their statuses.

This repository currently has no open or pending branches.


  • Check this box to trigger a request for Renovate to run again on this repository

Build failing due to missing six dependency

ImportError while importing test module '/workspace/tests/test_generate.py'.
Hint: make sure your test modules/packages have valid Python names.
Traceback:
/usr/local/lib/python3.9/importlib/__init__.py:127: in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
tests/test_generate.py:25: in <module>
    from google.cloud import storage
/h/.local/lib/python3.9/site-packages/google/cloud/storage/__init__.py:35: in <module>
    from google.cloud.storage.batch import Batch
/h/.local/lib/python3.9/site-packages/google/cloud/storage/batch.py:27: in <module>
    import six
E   ModuleNotFoundError: No module named 'six'

Let's add an explicit dependency to fix the build for now.
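
A minimal sketch of the stopgap, assuming the project's dependencies live in a requirements.txt (the exact file, and whether to pin a version, may differ):

# requirements.txt
# List six explicitly so the transitive import from google-cloud-storage resolves.
six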

Re-instantiate blobs if we run into 404 issues

@cojenco helped look into this issue: between the time a Blob is instantiated and the time we download the tarball with blob.download_to_filename(), the object can be re-uploaded with a "newer" version, and the download then fails because a generation parameter is included by default in the storage query. We can avoid this by re-instantiating the Blob when we run into a 404, which should minimize how often we fail with 404s.
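
A minimal sketch of the workaround, assuming the Blob came from list_blobs() (which pins a generation on the object); bucket.blob() creates a fresh Blob with no generation, so the retry fetches the newest upload:

from google.api_core import exceptions

def download_blob(blob, destination):
    # Try the pinned generation first; if it 404s because the object was
    # re-uploaded in the meantime, re-instantiate the blob and retry once.
    try:
        blob.download_to_filename(destination)
    except exceptions.NotFound:
        fresh = blob.bucket.blob(blob.name)
        fresh.download_to_filename(destination)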

Only fetch the exact xrefmap files needed for the current build

Tarballs can specify the xrefmaps they need using the xrefs field in docs.metadata. Let's use that field to fetch only the xrefmap files needed for the current build, rather than downloading every xrefmap for every build.

@jskeet came up with:

devsite://dotnet/Google.Api.Gax/2.5.0

We can convert that to an xrefmap by removing devsite://, replacing the first and last / with -, and adding .tar.gz.yml at the end. It will be an error if that xrefmap does not exist.
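
A literal sketch of that conversion rule (the function name is mine, not existing pipeline code, and any prefix on the stored xrefmap file name is ignored here):

def xref_to_xrefmap_name(xref):
    # devsite://dotnet/Google.Api.Gax/2.5.0 -> dotnet-Google.Api.Gax-2.5.0.tar.gz.yml
    path = xref[len("devsite://"):]
    language, _, rest = path.partition("/")
    name, _, version = rest.rpartition("/")
    return f"{language}-{name}-{version}.tar.gz.yml"

Callers would raise an error if the resulting xrefmap file does not exist.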

Another benefit of this is that one library can have multiple versions. Each version will have its own xrefmap. If every xrefmap is pulled in, there will be multiple xrefmap files that register the same UIDs. Plus, when we support multiple versions, we'll need to use just the right version of the xrefmap as URLs will be different.

Finally, this will benefit libraries without xrefs because they won't need to download anything.

@jskeet will implement the change to the dotnet libraries. I will implement the change to doc-pipeline.

Increase timeout from 10 hours

A FORCE_GENERATE_ALL build timed out after 10 hours, having processed 781 of 3045 blobs. Raw math, that's ~25% of the blobs in 10 hours, so I'll increase the timeout to 72 hours for now.

Request user to delete tmp directory when running tests

Running the tests with an existing tmp directory in the doc-pipeline directory can cause flakiness and unknown behavior.

Instead of potentially deleting the tmp folder prematurely, the tests should ask the user to get rid of it before running any tests.
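
One way to do that, sketched as a pytest session fixture that refuses to run instead of deleting anything (the fixture name and tmp path are assumptions):

import pathlib

import pytest

@pytest.fixture(autouse=True, scope="session")
def require_clean_tmp():
    tmp = pathlib.Path(__file__).resolve().parent.parent / "tmp"
    if tmp.exists():
        pytest.exit(
            f"Found an existing {tmp} directory. Please delete it before "
            "running the tests; leftover files cause flaky behavior."
        )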

Update published xref maps with correctly versioned URL

Once we have multi-version docs published, we should update our xref maps and regenerate content.

def get_base_url(language, name):
    # The baseUrl must start with a scheme and domain. With no scheme, docfx
    # assumes it's a file:// link.
    base_url = f"https://cloud.google.com/{language}/docs/reference/" + f"{name}/"
    # Help packages should not include the version in the URL.
    if name != "help":
        base_url += "latest/"
    return base_url
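
A sketch of what a versioned variant might look like once multi-version docs exist (the signature change is an assumption):

def get_base_url(language, name, version):
    # Use the real version segment instead of "latest/" so each published
    # version gets its own URL; help packages still omit the version.
    base_url = f"https://cloud.google.com/{language}/docs/reference/{name}/"
    if name != "help":
        base_url += f"{version}/"
    return base_url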

Action Required: Fix Renovate Configuration

There is an error with this repository's Renovate configuration that needs to be fixed. As a precaution, Renovate will stop PRs until it is resolved.

Error type: undefined. Note: this is a nested preset so please contact the preset author if you are unable to fix it yourself.

Enable Flakybot for nightly tests

Test bucket does not seem to be getting cleaned up properly

All new dependency PRs seem to have run into the test bucket having too many blobs left over from previous runs, which suggests it is not getting cleaned up properly. I'm not sure whether someone else is using the test bucket, but if it's not getting cleaned up properly we should look into why this is happening.

Automatically rebuild HTML when templates or YAML update

Right now, if you update the templates or the YAML of a package, the HTML won't get regenerated automatically. What if our default job changed to:

  1. Get all blobs.
  2. For every YAML blob:
    1. If no HTML version exists, generate it.
    2. Else if an HTML version exists and it was updated before the YAML blob, regenerate it.
    3. Else if an HTML version exists and it was generated before the latest commit to doc-templates, regenerate it.
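
A rough sketch of that decision, assuming blobs expose an updated timestamp (as google-cloud-storage blobs do) and that we know when doc-templates last changed:

def needs_regeneration(yaml_blob, html_blob, templates_updated):
    if html_blob is None:
        return True  # no HTML version exists yet
    if html_blob.updated < yaml_blob.updated:
        return True  # the YAML was updated after the HTML was generated
    if html_blob.updated < templates_updated:
        return True  # doc-templates changed after the HTML was generated
    return False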

Use stem configured in docs.metadata

The xref base URL assumes the default stem. There may be other places that assume it, too.

def get_base_url(language, name):
    # The baseUrl must start with a scheme and domain. With no scheme, docfx
    # assumes it's a file:// link.
    base_url = f"https://cloud.google.com/{language}/docs/reference/" + f"{name}/"
    # Help packages should not include the version in the URL.
    if name != "help":
        base_url += "latest/"
    return base_url
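
A sketch of what honoring the stem could look like, assuming docs.metadata carries a stem value and it gets threaded through to this helper (the parameter handling is an assumption):

def get_base_url(language, name, stem=""):
    # Fall back to the default stem when docs.metadata does not set one.
    stem = stem or f"/{language}/docs/reference"
    base_url = f"https://cloud.google.com{stem.rstrip('/')}/{name}/"
    # Help packages should not include the version in the URL.
    if name != "help":
        base_url += "latest/"
    return base_url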

generate: many tests failed

Many tests failed at the same time in this package.

  • I will close this issue when there are no more failures in this package and
    there is at least one pass.
  • No new issues will be filed for this package until this issue is closed.
  • If there are already issues for individual test cases, I will close them when
    the corresponding test passes. You can close them earlier, if you prefer, and
    I won't reopen them while this issue is still open.

Here are the tests that failed:

  • docfx-java-google-cloud-aiplatform-0.3.0.tar.gz
  • docfx-java-google-cloud-assured-workloads-0.3.1.tar.gz
  • docfx-java-google-cloud-bigquery-1.127.5.tar.gz
  • docfx-java-google-cloud-bigquery-1.127.6.tar.gz
  • docfx-java-google-cloud-compute-0.119.6-alpha.tar.gz
  • docfx-java-google-cloud-dns-1.1.2.tar.gz
  • docfx-java-google-cloud-functions-1.0.8.tar.gz
  • docfx-java-google-cloud-gcloud-maven-plugin-0.1.1.tar.gz
  • docfx-java-google-cloud-logging-logback-0.120.2-alpha.tar.gz
  • docfx-java-google-cloud-memcache-1.0.1.tar.gz
  • docfx-java-google-cloud-networkconnectivity-0.2.0.tar.gz
  • docfx-java-google-cloud-networkconnectivity-0.2.1.tar.gz
  • docfx-java-google-cloud-notification-0.121.7-beta.tar.gz
  • docfx-java-google-cloud-resourcemanager-0.118.10-alpha.tar.gz
  • docfx-java-google-cloud-retail-0.2.0.tar.gz
  • docfx-java-google-cloud-spanner-5.0.0.tar.gz
  • docfx-java-google-cloud-storage-1.113.12.tar.gz
  • docfx-java-google-cloud-workflow-executions-0.1.6.tar.gz
  • docfx-java-google-cloud-workflows-0.2.1.tar.gz
  • docfx-java-proto-google-cloud-orgpolicy-v1-1.1.1.tar.gz
  • docfx-java-proto-google-iam-v1-1.0.10.tar.gz
  • docfx-java-proto-google-identity-accesscontextmanager-v1-1.0.14.tar.gz
  • docfx-java-pubsublite-kafka-0.2.2.tar.gz
  • docfx-java-pubsublite-kafka-0.6.3.tar.gz
  • docfx-java-pubsublite-spark-sql-streaming-0.3.1.tar.gz
  • docfx-nodejs-speech-4.5.0.tar.gz

commit: 32ba472
buildURL: Build Status, Sponge
status: failed

Handle normalized semver versions for latest version handling

Similar to how normalized semver versions for Python have been an issue in the pipeline, those versions are also not getting picked up as the "latest" version when handling FORCE_GENERATE_LATEST, and probably for the xref handling as well. We should revert the Python versioning back to being semver-compliant when finding the latest version.

The logic will be slightly complicated: we'll have to convert versions to be semver-compliant, but also keep note of the original version string so we can pinpoint and pick up that exact version later if needed.
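
One possible approach (not necessarily what the pipeline will end up doing): compare with the packaging library, which understands Python's normalized forms, while keeping the original strings for looking the blobs back up:

from packaging import version

def pick_latest(version_strings):
    # Returns the original string of the highest version, so the matching
    # blob/tarball name is still recoverable afterwards.
    return max(version_strings, key=version.parse, default=None)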
