Coder Social home page Coder Social logo

Comments (9)

yoheimuta avatar yoheimuta commented on June 11, 2024 1

@AlexCannonball Thank you for providing more details about your use case - it makes perfect sense now.
Your explanation actually reminded me that I implemented a workaround to create a temporary file and give it to protolint when I was working on intellij-protolint.

To support this feature with minimal effort, I think protolint should create a temporary file in the func doLint before it starts processing:

  1. checking if there's any input from stdin
  2. if so, create a temporary file, write the content to it
  3. append the temporary file path to the args slice
# This is a rough sketch:
diff --git a/internal/cmd/cmd.go b/internal/cmd/cmd.go
index 6e172b2..5048ccc 100644
--- a/internal/cmd/cmd.go
+++ b/internal/cmd/cmd.go
@@ -3,6 +3,8 @@ package cmd
 import (
 	"fmt"
 	"io"
+	"io/ioutil"
+	"os"
 	"strings"
 
 	"github.com/yoheimuta/protolint/internal/cmd/subcmds/lint"
@@ -73,9 +75,39 @@ func doSub(
 
 func doLint(
 	args []string,
+	stdin io.Reader,
 	stdout io.Writer,
 	stderr io.Writer,
 ) osutil.ExitCode {
+	// Read from stdin
+	input, err := ioutil.ReadAll(stdin)
+	if err != nil {
+		_, _ = fmt.Fprintln(stderr, "Error reading from stdin:", err)
+		return osutil.ExitInternalFailure
+	}
+
+	// If there's input, create a temporary file and write the content to it
+	if len(input) > 0 {
+		tmpfile, err := ioutil.TempFile("", "protolint-stdin-*.proto")
+		if err != nil {
+			_, _ = fmt.Fprintln(stderr, "Error creating temporary file:", err)
+			return osutil.ExitInternalFailure
+		}
+		defer os.Remove(tmpfile.Name()) // Clean up
+
+		if _, err := tmpfile.Write(input); err != nil {
+			_, _ = fmt.Fprintln(stderr, "Error writing to temporary file:", err)
+			return osutil.ExitInternalFailure
+		}
+		if err := tmpfile.Close(); err != nil {
+			_, _ = fmt.Fprintln(stderr, "Error closing temporary file:", err)
+			return osutil.ExitInternalFailure
+		}
+
+		// Append temporary file path to args
+		args = append(args, tmpfile.Name())
+	}
+
 	if len(args) < 1 {
 		_, _ = fmt.Fprintln(stderr, "protolint lint requires at least one argument. See Usage.")
 		_, _ = fmt.Fprint(stderr, help)

What do you think? Would this approach work for you?

from protolint.

yoheimuta avatar yoheimuta commented on June 11, 2024

Hi @AlexCannonball

No, unfortunately the linter does not support reading from stdin at the moment.

Thanks.

from protolint.

AlexCannonball avatar AlexCannonball commented on June 11, 2024

Hello @yoheimuta

Thank you for the reply. The reason I'm asking is because I'm trying to improve UX in VS Code extension (https://github.com/plexsystems/vscode-protolint).

At the moment the extension only shows up-to-date linting warnings when you open or save the protobuf document. If the document has some unsaved changes, obviously they are not yet in the file we pass to protolint.

It doesn't seem to be a big deal for Unix/macOS, I guess even something like this may work:

protolint lint /dev/stdin

However, Windows doesn't seem to provide similar option: https://stackoverflow.com/questions/7395157/windows-equivalent-to-dev-stdin

At the moment I don't see any cross-platform workaround. Even if I success to pass the "document" via Unix domain socket and Windows named pipe, it still needs to consider the platform when generating the socket/pipe name. And it still looks like an overengineering.

Unlike that reading the document text from stdin seems to be easier to implement in the extension. Maybe you consider adding this feature to the linter.

If you have any thoughts, let me know.

from protolint.

AlexCannonball avatar AlexCannonball commented on June 11, 2024

@yoheimuta thank you for sharing the sketch. I don't know the project and Golang well, so let me comment from protolint API client prospect.

  • "File mode" is awesome for some tasks, e.g. CI/CD or as protoc plugin. When your files are ready and done.
  • When you work with protobuf in IDE, ideally you'd like to lint the text on any change event. And there are tons of these events. protolint seems to be fast enough, as Golang is fast. However, the "file mode" API seems to become a bottleneck, as you need to manage files before linting.

As I understand from the sketch, it creates a real file on the disk. ioutil.TempFile seems to be deprecated, BTW: https://pkg.go.dev/io/ioutil#TempFile

My thoughts about disk files:

  • Disk I/O operations are slower than working with RAM. Maybe it would be too slow for real-time linting.
  • Intensive protobuf text changes in IDE causes a lot of disk operations, i.e. faster disk device degradation 😞
  • I guess, the linter still needs to load the whole text into the memory. Protobuf files are usually relatively small. Reading a text chunk-by-chunk to lint chunk-by-chunk doesn't make sense, as you need to see the whole document to detect all errors.
  • Temp file cleanup and possible filename conflicts - additional points of failure.
    • ioutil.TempFile seems to prevent filename clashes with randomization, however, we start depending on this function.
  • Creating a temp file looks redundant, just a workaround to fit "file mode" API.

What do you think about these?

Let me describe how I see an ideal approach (if we change protolint API):

  • No changes in "File mode". It's very good for its tasks.
  • Add a new command, like protolint lintstdin [args] for IDE real-time linting.
  • The new command tells the linter to read directly from stdin.

I was trying to find where loading from file to memory happens, and ended up in go-protoparser:
https://github.com/yoheimuta/go-protoparser/blob/c45546ae3e434eb92eb482faa3873e733c30af8d/lexer/scanner/scanner.go#L46

But I don't understand how many functions need to be added or changed for the "ideal" approach. Maybe just adding something like reader, err := bufio.NewReader(os.Stdin) when lintstdin command is set here?

reader, err := os.Open(f.path)

from protolint.

AlexCannonball avatar AlexCannonball commented on June 11, 2024

@yoheimuta hello, a quick update on my research regarding passing the text via named pipes. It works on Windows like this:
PS D:\protolint> .\protolint.exe lint \\?\pipe\mypipe.proto

However, it works with problems:

  • It's significantly slower than linting a usual file, takes seconds to see the linting results.
  • In the extension I run the server to fill the pipe with the text. When debugging nodejs I always see 22 consequent connections to the server to read \\?\pipe\mypipe.proto for 1 linting task. The first connection always fails with write EPIPE error after I write the text to the socket. Maybe it's an OS call to make sure that the pipe is ready 🤷 However, after that I always get 21 connections regardless the proto-file size. I wonder, maybe it's protoparser scanner/reader algo makes 21 consequent file passages which turns to 21 different connections to my nodejs server providing the text to the pipe?

from protolint.

AlexCannonball avatar AlexCannonball commented on June 11, 2024

Hello @yoheimuta
Do you need any details regarding this issue? I guess if protolint really reads the file 22 times for 1 linting command, it may cause heisenbugs when the file changes between these reading operations. Should I create a separate issue for this?
I guess reading the file 1 time and caching its contents in protolint memory could fix it.

As for linting from stdin, I think it's not necessary for VS Code extension, if the slow reading from the pipe is fixed. However, even if you decide to implement linting from stdin, the first read operation will empty stdin buffer, so the subsequent reading attempts may fail.

from protolint.

yoheimuta avatar yoheimuta commented on June 11, 2024

Hi @AlexCannonball

I guess if protolint really reads the file 22 times for 1 linting command

Yes, 375f265 intentionally has changed to read the most recent content, even if the fixer modifies the file along the way. This decision was based on my assumption that local access to a small file is fast enough to justify the extra read.

My thoughts about disk files:

While I understand your concern about potential performance issues with excessive disk access, I have not seen or heard of any problems with intellij-protolint, which creates a temporary file each time a source file is updated.

Just did some experimenting on my MacBook Pro and wanted to share my results with you!

❯ go run main.go -files=1000 -chars=1000
Time taken to create 1000 temp files with 10000 random characters each: 220.808876ms

❯ go run main.go -files=1000 -chars=100
Time taken to create 1000 temp files with 10000 random characters each: 308.157407ms

❯ go run main.go -files=1000 -chars=10000
Time taken to create 1000 temp files with 10000 random characters each: 356.361094ms

❯ go run main.go -files=1000 -chars=100000
Time taken to create 1000 temp files with 10000 random characters each: 1.173995759s

❯ go run main.go -files=10 -chars=1000000
Time taken to create 1000 temp files with 10000 random characters each: 87.575473ms
main.go
package main

import (
	"crypto/rand"
	"flag"
	"fmt"
	"os"
	"time"
)

func main() {
	var numFiles int
	var numChars int

	flag.IntVar(&numFiles, "files", 1000, "Number of temporary files to create")
	flag.IntVar(&numChars, "chars", 1000, "Number of random characters in each file")
	flag.Parse()

	start := time.Now()

	for i := 0; i < numFiles; i++ {
		content := generateRandomContent(numChars)
		err := createTempFile(content)
		if err != nil {
			fmt.Printf("Error creating temp file: %v\n", err)
			os.Exit(1)
		}
	}

	elapsed := time.Since(start)
	fmt.Printf("Time taken to create 1000 temp files with 10000 random characters each: %v\n", elapsed)
}

func generateRandomContent(length int) []byte {
	content := make([]byte, length)
	_, err := rand.Read(content)
	if err != nil {
		fmt.Printf("Error generating random content: %v\n", err)
		os.Exit(1)
	}
	return content
}

func createTempFile(content []byte) error {
	tempFile, err := os.CreateTemp("", "tempfile_")
	if err != nil {
		return err
	}
	defer tempFile.Close()

	_, err = tempFile.Write(content)
	if err != nil {
		return err
	}
	return nil
}

But I don't understand how many functions need to be added or changed for the "ideal" approach.

Implementing this feature would require a lot of code to write and maintain. If your idea relies on this feature, I recommend trying the workaround I suggested first, rather than spending extra effort on implementing the ideal approach. This is especially true if your concern isn't certain

from protolint.

AlexCannonball avatar AlexCannonball commented on June 11, 2024

Hello @yoheimuta and thank you for the reply.

Implementing this feature would require a lot of code to write and maintain

As I understand, this estimation is related to the feature with reading from stdin + making no temporary disk files. This totally makes sense, for VS Code extension I see at least one workaround which doesn't need stdin at all.

Yes, 375f265 intentionally has changed to read the most recent content, even if the fixer modifies the file along the way

Is adding a memory cache there also hard to maintain? By the memory cache I mean something like this:

  • Read the file A from disk to protolint memory once.
  • Make all necessary read and write operations with the text in memory.
  • Save the fixed text into A on disk.

If this change is acceptable, it removes extra reads from the target file URI. This hopefully fixes the slow reading the text via Nodejs IPC which I'm trying to use as a workaround for the VS Code Extension.

from protolint.

yoheimuta avatar yoheimuta commented on June 11, 2024

@AlexCannonball Thank you for sharing your thought.

Is adding a memory cache there also hard to maintain?

Unfortunately, I'm afraid that's the case.

from protolint.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.