
lighthouse-audit-service's People

Contributors

adamdmharvey, cristianrgreco, dan-kez, dependabot[bot], erikxiv, mayigrin, odeit, spotify-kai

lighthouse-audit-service's Issues

Rethink how audits are grouped

Audits are currently grouped most directly by "website." This is useful to some extent, but inside Spotify the audits are run more like named entities, something like:

- name: authenticated-mobile-homepage
  url: https://some-site.com
  authenticated: true
  screen-size: mobile
- name: unauthenticated-desktop-list-view
  url: https://some-site.com/list/foo
  authenticated: false
  screen-size: desktop

We should allow this type of setup more directly, potentially by having some kind of stored audit id with the ability to re-run and schedule that audit. Anything without an existing audit id could probably have the url become the id and then we could deprecate or push people away from the website thing.
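One way to picture the fallback described above is a small grouping helper. This is only a sketch: the field names mirror the YAML example, while `resolveAuditId` and `groupByAuditId` are hypothetical names that do not exist in lighthouse-audit-service.

```javascript
// Hypothetical stored audit definitions, mirroring the YAML example above.
// The second entry has no name, so its URL is used as the audit id.
const definitions = [
  {
    name: "authenticated-mobile-homepage",
    url: "https://some-site.com",
    authenticated: true,
    screenSize: "mobile",
  },
  {
    url: "https://some-site.com/list/foo",
    authenticated: false,
    screenSize: "desktop",
  },
];

// Use the explicit name as the audit id when present; otherwise fall back to the URL.
function resolveAuditId(definition) {
  return definition.name ?? definition.url;
}

// Group audit runs (or definitions) under their resolved audit id.
function groupByAuditId(entries) {
  const groups = new Map();
  for (const entry of entries) {
    const id = resolveAuditId(entry);
    if (!groups.has(id)) groups.set(id, []);
    groups.get(id).push(entry);
  }
  return groups;
}
```

With this shape, existing URL-only audits keep working unchanged, and named audits can be re-run or scheduled by id.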

LHError: PROTOCOL_TIMEOUT

Every time I try to run an audit, I get the following error:
LHError: PROTOCOL_TIMEOUT

I'm not sure what to do with this. Any help would be appreciated.

OpenAPI/Swagger API documentation

We need Swagger and/or OpenAPI documentation for the endpoints in lighthouse-audit-service. It would be preferable to generate these, considering we already have TypeScript types annotating the expected requests and responses.

Is this project dead?

Is it still being maintained? We are thinking about adopting lighthouse plugin in Backstage and wonder whether it is the best direction forward.

LHR summary on ListItem

Currently, we don't pass any LHR data in the Audit List. We could provide things like the category scores, which would be pretty useful. It would also be cool to be able to filter on those (show me audits with SEO scores less than 0.7 from the past week)
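A rough sketch of what such a summary and filter could look like follows. It assumes the Lighthouse result keeps `categories` as an object keyed by category id with a numeric `score` between 0 and 1 on each entry; `summarizeCategories`, `filterRecentBelowScore`, and the `timeCreated` item field are hypothetical names chosen for illustration.

```javascript
// Reduce an LHR to a small { categoryId: score } map suitable for a list item.
function summarizeCategories(lhr) {
  const summary = {};
  for (const [id, category] of Object.entries(lhr.categories ?? {})) {
    summary[id] = category.score;
  }
  return summary;
}

const WEEK_MS = 7 * 24 * 60 * 60 * 1000;

// Keep list items from the past week whose score for `categoryId` is below `maxScore`.
// Items without a numeric score for that category are skipped.
function filterRecentBelowScore(items, categoryId, maxScore, now = new Date()) {
  const cutoff = now.getTime() - WEEK_MS;
  return items.filter((item) => {
    const score = item.categories?.[categoryId];
    return (
      typeof score === "number" &&
      score < maxScore &&
      item.timeCreated.getTime() >= cutoff
    );
  });
}
```

For example, `filterRecentBelowScore(items, "seo", 0.7)` would answer the "SEO scores less than 0.7 from the past week" query on an in-memory list; a real implementation would more likely push this filter into the database query.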

Support pagination on API responses

Routes which return list-based results do not paginate, leading to out-of-memory issues for websites with a long audit history.

Pagination should be implemented on all API routes which return a list of items, with a sensible default (to gracefully integrate with the existing Backstage UI plugin)

My use case:
We automate Lighthouse audits for our website three times a day (we deploy multiple times per day). However, the array has become so large that the lighthouse-audit-service pod crashes because it doesn't have enough memory to handle the response from /v1/audits/:auditId/website

I propose pagination is added to every API route which returns an array with a default limit of 25 items per page.
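The proposal above could be sketched roughly as follows. The default of 25 comes from the issue text; the `limit` and `offset` query parameter names, the `MAX_LIMIT` cap, and both function names are assumptions made for illustration, not part of the service's API.

```javascript
const DEFAULT_LIMIT = 25; // default proposed in this issue
const MAX_LIMIT = 100; // assumption: an upper bound so clients cannot request everything at once

// Parse untrusted query-string values into a safe { limit, offset } pair.
function parsePagination(query) {
  const limit = Number.parseInt(query.limit, 10);
  const offset = Number.parseInt(query.offset, 10);
  return {
    limit: Number.isInteger(limit) && limit > 0 ? Math.min(limit, MAX_LIMIT) : DEFAULT_LIMIT,
    offset: Number.isInteger(offset) && offset >= 0 ? offset : 0,
  };
}

// In-memory illustration of the response envelope. A real implementation would
// pass limit and offset to the database query (LIMIT/OFFSET) rather than slicing,
// since loading every row first would not fix the memory problem.
function paginate(allItems, query) {
  const { limit, offset } = parsePagination(query);
  return {
    items: allItems.slice(offset, offset + limit),
    total: allItems.length,
    limit,
    offset,
  };
}
```

Falling back to the default when the parameters are missing or invalid means clients that send no pagination parameters still get a bounded response.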

[Security] SSRF using parameter ExtraHeaders leading to dangerous internal http call

Describe the bug
Hi team,
We found that lighthouse-audit-service, which is used by Backstage as a plugin, can be used to send HTTP requests to arbitrary URLs.
Lighthouse is of course meant to audit websites, but this is dangerous because it can also be used to send HTTP requests to the internal network, including calls to the GCP metadata server to obtain sensitive information such as OAuth tokens.

To Reproduce

  1. Prepare a server to be audited. This server redirects to the desired internal endpoint.
    [image: sample redirect handler]

  2. Send an audit request and add the additional parameter ExtraHeaders, so that every time lighthouse-audit-service sends an HTTP request the additional header is included.
    [image: audit request with ExtraHeaders]
  3. When the audit is done, we can fetch the response of the internal HTTP call, captured in the variable final-screenshot.

GCP and other cloud providers protect against SSRF by validating a required request header, but since lighthouse-audit-service allows the ExtraHeaders parameter, an attacker can add any header they want.

As mentioned in README.md, this project was built with Backstage in mind, so we first reported it to Backstage. After discussion, the Backstage team referred us to spotify/lighthouse-audit-service.

Thank you
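As an illustration of one possible mitigation (not the project's actual fix), a service could reject obviously internal targets and strip the headers that cloud metadata servers require. The function names and block lists below are assumptions; the header names are those used by the GCP (`Metadata-Flavor`), Azure (`Metadata`), and AWS IMDSv2 (`X-aws-ec2-metadata-token`) metadata endpoints. Because the attack above relies on a redirect, a URL check alone is not sufficient: redirects and DNS resolution would also need to be constrained, and IPv6 literals are not handled in this sketch.

```javascript
// Assumed block lists for illustration only.
const BLOCKED_HOSTNAMES = new Set(["localhost", "metadata.google.internal"]);
const BLOCKED_HEADERS = new Set(["metadata-flavor", "metadata", "x-aws-ec2-metadata-token"]);

// True for IPv4 literals in loopback, private, or link-local ranges.
function isPrivateIPv4(hostname) {
  const match = /^(\d{1,3})\.(\d{1,3})\.(\d{1,3})\.(\d{1,3})$/.exec(hostname);
  if (!match) return false;
  const [a, b] = match.slice(1).map(Number);
  return (
    a === 0 ||
    a === 10 ||
    a === 127 ||
    (a === 169 && b === 254) || // link-local, includes 169.254.169.254
    (a === 172 && b >= 16 && b <= 31) ||
    (a === 192 && b === 168)
  );
}

// Reject non-HTTP(S) URLs and URLs whose host is on the block list or is a private IPv4 literal.
function isAllowedAuditUrl(rawUrl) {
  let url;
  try {
    url = new URL(rawUrl);
  } catch {
    return false;
  }
  if (url.protocol !== "http:" && url.protocol !== "https:") return false;
  const host = url.hostname.toLowerCase();
  return !BLOCKED_HOSTNAMES.has(host) && !isPrivateIPv4(host);
}

// Drop caller-supplied headers that metadata servers use as their SSRF guard.
function sanitizeExtraHeaders(extraHeaders = {}) {
  return Object.fromEntries(
    Object.entries(extraHeaders).filter(([name]) => !BLOCKED_HEADERS.has(name.toLowerCase()))
  );
}
```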

Running several audits in parallel results in FAILED audit

Hi there!

I am trying to set up the Lighthouse service for our internal project; we are planning to continuously run audits for different URLs. I am using the official image:

services:
  lh2:
    image: spotify/lighthouse-audit-service:latest
    environment:
      LAS_PORT: 1234
      PGUSER: user
      PGHOST: db
      PGPASSWORD: password
      PGDATABASE: dbname
    ports:
      - 1234:1234

When I do several POST requests with the following payload:

{
    "url": "https://www.google.com/",
    "options": {
        "awaitAuditCompleted": true,
        "chromePort": <randomPortNumber>
    }
}

I only ever get results for the first successful response. The rest fail with the following error message in the logs:
error: failed while running lighthouse audit.

Is this normal, or am I doing something wrong?

Thanks, Ivan

Investigate future Lighthouse 6 incompatibilities

An alpha release of Lighthouse 6 was released on March 11, so we should start looking into what the situation will be if we end up straddling between Lighthouse 5.6 and Lighthouse 6 LHRs. Also, TypeScript seems to be coming to Lighthouse, which is important to look into in terms of whether we can cut our custom types loose.

Export 'persistAudit' from audit api module to support custom lighthouse scripts

Hello, I've got a custom lighthouse script which runs outside of the audit service, however I would like to have it reported / tracked by the audit service.

To do this I'm installing this package as a dependency of my lighthouse script, but I need to make use of the 'persistAudit' function from the audit api module: https://github.com/spotify/lighthouse-audit-service/blob/master/src/api/audits/index.ts

Seeing as the other functions are exported, can this be added to the export list? It would allow me to upload the results of an independent lighthouse run to the lighthouse audit service database... Here is my relevant code snippet:

const lhAudit = require('@spotify/lighthouse-audit-service');
const uuid = require('uuid');
const pg = require("pg");
const lighthouse = require('lighthouse');

/* REDACTED */

// A custom run of lighthouse using the official lighthouse package
const result = await lighthouse(url, flags, config);

if (process.env.USE_DB === "true") {
    // Report to lhAuditService
    const audit = lhAudit.Audit.build({
        id: uuid.v4(),
        url: result?.lhr.requestedUrl,
        timeCreated: new Date(result?.lhr.fetchTime),
        timeCompleted: nowDate,
        report: result?.lhr
    });
    await lhAudit.persistAudit(        // This is what I want to do
        new pg.Pool({
            host: process.env.DB_HOST,
            port: process.env.DB_PORT,
            database: process.env.DB_NAME,
            user: process.env.DB_USERNAME,
            password: process.env.DB_PASSWORD
        }),
        audit
    );
}

As a workaround I've copied the source for persistAudit to this script.

WebSocket api for running audits

Hey guys! It would be pretty nice to have a WebSocket API for Lighthouse audits. Currently the list of audits can display status, but it doesn't update until the page is reloaded. I want to get the actual status of current audits in real time. If you are looking for contributions I can handle it, but please let me know if it's a good idea.

Integration with Backstage catalog service

Problem

At Spotify, we need the Lighthouse audits to tightly integrate with Backstage's catalog service, particularly when used with the Backstage plugin.

  • as a developer viewing a website component, I want to see all audits run for my website.
  • as a developer viewing a website component, I want to run new audits for my website.

Proposed Solution

Note: This solution could be a path forward for many of the services we open source which have a hard dependency on the currently internal catalog service @ Spotify.

I would propose that lighthouse-audit-service continue to work without a catalog id, and instead add a Kubernetes-like metadata field to every entry. This could be a JSON field in the Postgres DB, and you could look up items by their metadata. We could support nuanced operators to allow for startsWith and endsWith lookups.

# obviously, this would be url encoded; not encoding for the sake of readability
http://lighthouse/v1/audits?metadata={"foo": { "op": "=", "value": "bar" }}

Then, when Lighthouse audits are created via Backstage, we would create them with the catalog entry's id as metadata.
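A minimal sketch of how a client might build the encoded query shown above, and how the operators could be evaluated, follows. Only the `=` operator and the `metadata` parameter appear in the proposal; the `startsWith` and `endsWith` operator spellings and both function names are assumptions, and a real service would more likely translate the filter into a Postgres JSON query than match in memory.

```javascript
// Build the URL-encoded form of the metadata query from the example above.
function buildMetadataQueryUrl(baseUrl, filters) {
  const url = new URL("/v1/audits", baseUrl);
  url.searchParams.set("metadata", JSON.stringify(filters));
  return url.toString();
}

// Assumed operator set: "=" is from the proposal, the other two are illustrative.
const OPERATORS = new Map([
  ["=", (actual, expected) => actual === expected],
  ["startsWith", (actual, expected) => typeof actual === "string" && actual.startsWith(expected)],
  ["endsWith", (actual, expected) => typeof actual === "string" && actual.endsWith(expected)],
]);

// True when every filter key matches the entry's metadata under its operator.
// Unknown operators never match.
function matchesMetadata(metadata, filters) {
  return Object.entries(filters).every(([key, { op, value }]) => {
    const compare = OPERATORS.get(op);
    return compare !== undefined && compare(metadata?.[key], value);
  });
}
```

An audit created from Backstage would then carry something like `{ catalogId: "..." }` as its metadata (a hypothetical key), and the plugin could query by that value.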

In the Backstage plugin, we'd still have the top-level view, but we would also add component-level and things-I-own-level views for viewing audits and their trends.

Create `website` list and get api

{
  "url": String,
  "time_last_audited": Date, // latest time_created
  "audits": {
    "items": [{
       "id": String,
       "iframe_url": String, // url that can be used in the iframe
       "url": String,
       "status": Status,
       "time_created": Date,
       "time_completed": Date?,
       "categories": LHR.Categories?, // categories with their scores
       "report": LHR
    }],
    "total": Int,
    "limit": Int,
    "offset": Int
  }
}

List is the same, with report omitted.
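The relationship between the two responses could be sketched like this, using the field names from the shape above. `latestTimeCreated` and `toListView` are hypothetical helpers, and `time_created` is assumed to be a JavaScript `Date`.

```javascript
// Derive time_last_audited as the latest time_created among a website's audits.
function latestTimeCreated(items) {
  if (items.length === 0) return null;
  return new Date(Math.max(...items.map((item) => item.time_created.getTime())));
}

// Produce the list variant of a website: identical, but with `report` omitted
// from every audit item. The input object is not modified.
function toListView(website) {
  return {
    ...website,
    audits: {
      ...website.audits,
      items: website.audits.items.map(({ report, ...rest }) => rest),
    },
  };
}
```

Dropping the full LHR from list responses keeps them small, which also helps with the memory concerns raised in the pagination issue.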
