Coder Social home page Coder Social logo

Comments (6)

RoxKilly avatar RoxKilly commented on June 10, 2024

@derekperkins what you're proposing is meant to support this right? (copied from protoc-gen-validate issue)

Suppose I have an array of messages with field id.

message Student {
  string id = 1;
}

message Class {
 // Could your suggestion be used to validate that students[n].id is not repeated within this array?
  repeated Student students = 1; 
}

Have you found a way to get this behavior with the current API?

from protovalidate.

derekperkins avatar derekperkins commented on June 10, 2024

Yes, this is exactly the use case I'm suggesting. No, I haven't solved this today with protovalidate, but I'm interested to hear back if you find a good workaround

from protovalidate.

rodaine avatar rodaine commented on June 10, 2024

You may be able to get away with this using what's available in CEL today on Class.students:

this.map(student, student.id).unique()

This uses the e.map CEL standard macro and our unique custom function.

from protovalidate.

matthewpi avatar matthewpi commented on June 10, 2024

You may be able to get away with this using what's available in CEL today on Class.students:

this.map(student, student.id).unique()

This uses the e.map CEL standard macro and our unique custom function.

I was able to get this to work by specifying the CEL expression on a message, rather than the repeated field. Here is a code sample:

message Environment {
	option (buf.validate.message).cel = {
		id: "repeated.unique_by_field"
		message: "repeated value must contain unique items, identified by the name field"
		expression: "this.variables.map(v, v.name).unique()"
	};

	// Variables to use within the environment. Each variable must have a unique `name` field.
	repeated EnvVar variables = 1;
//	[(buf.validate.field).cel = {
//		id: "repeated.unique_by_field"
//		message: "repeated value must contain unique items, identified by the name field"
//		expression: "this.map(v, v.name).unique()"
//	}];
}

message EnvVar {
	// Name of the environment variable. Must be unique.
	string name = 1;
	// Value of the environment variable.
	string value = 2;
}

The buf.validate.message CEL expression works like a dream, returning the following violation.

"violations": [
	{
		"fieldPath": "environment",
		"constraintId": "repeated.unique_by_field",
		"message": "repeated value must contain unique items, identified by the name field"
	}
]

However, as you might have noticed. The fieldPath is not what is wanted in this scenario. While the expression is being ran on environment, that expression only ever refers to the variables field, making it more difficult to properly match validation errors to the exact field they are related to.

If we use the buf.validate.field CEL rule instead of the one on the message, we hit an issue with the way CEL expressions are processed on repeated fields. It seems that when you manually specify a CEL expression on a repeated field, it is processed on each item within the associated array.

compilation error: failed to compile embedded type Environment for UpdateRequest.environment: compilation error: failed to compile expression repeated.unique_by_field: ERROR: \u003cinput\u003e:1:1: expression of type 'EnvVar' cannot be range of a comprehension (must be list, map, or dynamic)

| this.map(v, v.name).unique()
| ^

Assuming that the issue preventing the expression from being evaluated on the repeated field itself is resolved, there is just one other issue. Which is handling what field needs to be unique. The constraint ID can be used for localization, but right now it is just a generic repeated.unique_by_field which doesn't indicate any information regarding what field was used for the unique validation. There are two issues here, first is the fieldPath not being specific enough on what indexes failed the unique check (and the associated field potentially), and the constraintId not having a proper way of communicating what field the unique check is being ran on. Right now the best solution is to either have your message contain the field name, or if you need proper localization, to add the field name to the constraint key manually. But ideally there should be a proper solution for handling things like this in the future.

from protovalidate.

mivanovcpd avatar mivanovcpd commented on June 10, 2024

After looking at different ways of doing this, could we perhaps have an annotation to mark unique fields:

message Student {
  string first_name = 1;
  string last_name = 2;
  option (buf.validate.message).unique.keys = ["first_name","last_name"];
}
message StudentList {
  repeated Student students = 1 [(buf.validate.field).repeated.unique = true];
}

We could set a restriction on the types of the fields in unique.keys. Say it would be simplest if we only allow:

  • allow scalar(optional too), Message(provided it has (buf.validate.message).unique.keys option)
    Note: it's very important to allow optional here as it's crucial for many validations where you need to differentiate between missing and default value
  • do not allow: oneof, repeated, Message without (buf.validate.message).unique.keys option

Do you think this could work?

from protovalidate.

rodaine avatar rodaine commented on June 10, 2024

compilation error: failed to compile embedded type Environment for UpdateRequest.environment: compilation error: failed to compile expression repeated.unique_by_field: ERROR: \u003cinput\u003e:1:1: expression of type 'EnvVar' cannot be range of a comprehension (must be list, map, or dynamic)

@matthewpi, I think this is a bug on protovalidate-go's part (likely in the type checking). This should have worked. I'll take a look and see if I can repro and add some tests for it. [Edit: protovalidate-go#80]


There are two issues here, first is the fieldPath not being specific enough on what indexes failed the unique check (and the associated field potentially), and the constraintId not having a proper way of communicating what field the unique check is being ran on.

@matthewpi, there's #17 and #125 which aim to make it easier to get at the underlying info of a violation error. That said, with a custom CEL expression, you have control over the id and error message and can even construct it within the expression to use string formatting against the variables provided:

repeated EnvVar variables = 1 [(buf.validate.field).cel = {
  id: "env.vars.unique", // the ID can be very specific to the field in question
  expression: "this.map(v, v.name).unique() ? '' : 'variables are expected to be unique, got %s'.format([this.map(v, v.name)])"
}

We intentionally do not include any data in the standard constraint error messages or as part of the violation error as there are security and PII concerns with echoing potentially sensitive data. You are free to do so in custom expressions however, and we're planning to address the linked issues above in the near term.


We could set a restriction on the types of the fields in unique.keys.

@mivanovcpd, a multi-field uniqueness comparison can be problematic for languages that don't have equivalents of Java's equals and hashCode overrides. For protobufs in Go, for instance, you cannot use msg1 == msg2 as Go only performs address comparison for pointer types and further the messages contain internal state that may be different between messages even if their fields are identical (so the protobuf library includes a proto.Equal which is used for comparison). Likewise, Go does not allow overriding the hash function of a type. This means we cannot use a map to make the uniqueness check O(n) as we'll need to explicitly proto.Equal each member of the repeated field to the others (which would run in O(n^2)). Alternatively, especially with a subset of fields, we'd need to write a custom hashing utility explicitly for this purpose if we wanted to maintain an O(n) runtime.

That said, if we can determine a way to make this universally efficient and not a performance landmine, either in the case of uniqueness across the entire message or a subset of fields (using a FieldMask WKT for this would offer the most flexibility), I'm open to adding such a standard constraint.

from protovalidate.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.