dohwg / draft-ietf-doh-dns-over-https Goto Github PK

View Code? Open in Web Editor NEW

51.0 51.0 19.0 367 KB

Discussion of draft-ietf-doh-dns-over-https in the IETF's DOH Working Group

Makefile 100.00%

draft-ietf-doh-dns-over-https's People

Contributors

Stargazers

Watchers

Forkers

martinthomson chantra samuelweiler elear hardaker aancw otakumonster zhaoshuli991 aniketh01 nonanet matjon fantuz saradickinson nusenu loganaden worenga cyberstormdotmu

draft-ietf-doh-dns-over-https's Issues

distinguish transport security from dnssec security in sec considerations

pr #25 did some of this but was backed out. see that diff and discussion

Clear up point about MTI server response format

The fact that an HTTP server can provide a response in the format of its choice was not well understood by DNS people. This motivates the MTI format for servers.

text on security considerations of not-doh-on-error

forked off from #85 which deals with error handling, text on what happens if you choose to handle errors in a non doh fashion related to mailing list thread: A question on the mix of DNS and HTTP semantics

Spontaneous pushes of DNS queries and poisoning

This protocol is a little bit special in the sense that clients that are DNS API capable will encounter servers that are able to push DNS requests and responses in the process of doing other things.

If a pushed response arrives, we need to make it clear that it would be poisoning to accept that response (and cache it).

To provide a concrete example, if a client has dns.example.net configured and it browses to www.example.com, then if www.example..com used server push to push a response that was otherwise a legitimate DNS API query, then that would NOT be treated as a DNS query, it would NOT enter the DNS cache. Though it might enter the HTTP cache, the DNS API client would never ask for that response.

New query parameter names

@mcmanus referring back to pull #19, you wrote this:

"I could go either way on body. You're right that the actual mime type is a bit of a problem.. I'll try it with a ct name argument without a value having the default meaning. I think that's explicit enough, while I don't really like totally implicit things like the absence of ct (what if you mis-spell it?)"

I'm fine with having body now named dns, but I'm concerned by the use of a the ct query parameter without a value at all...

Whilst this is probably too unwieldy:

/dns-query?ct=application/dns-udpwireformat&dns=AAABAAABAAAAAAAAA3d3dwdleGFtcGxlA2NvbQAAAQAB

the use of ct alone just seems...wrong:

/dns-query?ct&dns=AAABAAABAAAAAAAAA3d3dwdleGFtcGxlA2NvbQAAAQAB

Yes, I know it's valid according to RFC 3986, but I'm wondering whether, since it's very unlikely that a JSON format would be passed using a GET, we could treat the absence of ct as implying ct=application/dns-udpwireformat?

Alternatively we could allow a couple of special values for the ct parameter:

ct=binary - implies that the format is application/dns-udpwireformat
ct=json - implies that the format is application/simpledns+json

Text about future formats is wishy-washy

The discussion of Accept: header tries too hard to suggest that there will be future formats. We don't actually know that, and the examples in the current draft have confused some developers.

See PR #78.

http is stateless

we can provide a reminder that http is stateless and therefore reordering happens so don't use doh for a session protocol.

wireformat isn't mentioned until section 5

it could use a preview to help the reader orient herself

DNS64 and DOH

wrt mailing list thread DNS64 and DOH

andrew sullivan suggests modifying an existing paragraph in operational considerations to have one new sentence:

Local policy considerations and similar factors mean different DNS
servers may provide different results to the same query: for instance
in split DNS configurations [RFC6950]. It logically follows that the
server which is queried can influence the end result. Therefore a
client's choice of DNS server may affect the responses it gets to its
queries. In the case of DNS64 [RFC6147], the choice could affect
whether IPv6/IPv4 translation will work at all.

in my opinion that's fine. We don't need to enumerate everything, but DNS64 was already called out as a non goal so we might as well note this implication of that.

some content types wont want b64 in url

this is an evolution of #51

cache lifetimes

during the singapore meeting tale noted that the http caching model, where 1 lifetime is assigned to the rrset is not necessarily the way a dns cache would handle things.. each rr can have its own ttl (though one dns implementation does it this way).

perhaps worth some explanatory text when talking about rrset granularity and http caches.

A single DNS packet query per HTTP request

It is not stated crystal clear in the -01 draft if there's only one DNS packet per HTTP request or if it is allowed to send more than one. Some wordings (like the DNS query) suggest a single packet. I think it would benefit the document to be more explicit on this detail.

Negative caching

§5.0 could benefit from a reference to RFC 2308 to clarify that it's talking about negative answer caching.

Revalidation is a feature

The draft explicitly prohibits ETag, which suggests that we don't care for revalidation requests at the HTTP layer (and reliance on the resolver cache). I don't think that's necessary. Revalidation requests are a feature HTTP provides.

It might be that the cost of serving revalidation is comparable to the cost of making a genuine request. DNS responses are generally modest in size. But using that as the basis of a decision here might foreclose on some use cases, for which it might be annoying to have to revise the protocol to support.

Right now, I don't see a concrete reason to prohibit revalidation, ETag, Last-Modified, and the works. If we consider the DNS API server as authoritative for its view of the state of particular resource records, then all these things can be used. For instance, Last-Modified can refer to the time that the DNS API server last received a response (or even when it last received a response that changed its cached value, which is a different view on this).

That said, the rough model I described in #14 seems to suggest that DNS TTL limits what HTTP can do. If that model is correct, and both the DNS stack and API server cache right up to the TTL, then HTTP caching can't add any value.

Parameter names are pretty long

content-type could be much shorter. Bytes in requests matter and these are going to be used in critical conditions (early in the connection where the number of requests you can make depend on how big they are). These won't compress well enough with HPACK to get any real benefit.

A short name would be an improvement. A default value for content-type would be even better.

Discuss server flags?

What do we think about discussing server (and possibly) client flags explicitly? EG, is aDOH server allowed to be authoratative? Can a client set the CD bit and expect it's internal API to honor that and not serve from http caching? ...

guidance on get/post

as discussed on list..

1) Make one mandatory-to-implement
2) Make both mandatory-to-implement
3) Talk about why there is no mandatory-to-implemen

dnssec and doh

on list andrew s writes:

Section 6 of -03 says this:

Different response media types will provide more or less information
from a DNS response. For example, one response type might include
the information from the DNS header bytes while another might omit
it.

But section 9 says

     A DNS API client may also perform full

DNSSEC validation of answers received from a DNS API server or it may
choose to trust answers from a particular DNS API server, much as a
DNS client might choose to trust answers from its recursive DNS
resolver.

It seems to me that these are in tension with one another, because the
AD and CD bits are in the header that the response type is permitted
to throw away. Maybe it could be resolved thus:

NEW

     A DNS API client may also perform full

URI Templates

Right now DOH specifies static query parameters for its URLs to convey the content type of the DNS query as well as the actual query itself.

If DOH used URI Templates (implementation listing) to describe the URLs that clients accessed, deployments wouldn't be locked into using those particular query arguments; the information needed could be conveyed anywhere in the URL, including the path or even the hostname.

E.g., your DOH service's template could be:

https://doh.example.net/?ct={content_type}&dns={dns}

while mine could be:

https://doh.example.org/{ct}/{dns}

or many other things, like:

https://doh-{content_type}.example.org/whatever/?{dns}

This means that the protocol artefact used to configure a DOH service would be a URI template, not a URI.

include base64url example

I've seen an implementation report that confused base64 and base64url; they noted that the examples didn't include one where the distinction was important

May need a different ID for helping caching

https://tools.ietf.org/html/draft-ietf-dnsop-session-signal is now in WG Last Call in the DNSOP WG. That draft makes the ID of 0 special for certain DNS requests (not all requests), but if the draft is approved, those requests might also want to be passed in DOH. We should pick a suggested ID that is not 0. If we pick one like 0xFFFF, we get the extra benefit of not needing a different example to differentiate base64url and base64 encoding.

Allow >512 octets

Whether or not RFC 1035 has this limit, DOH should not.

.well-known

DOH defines a well-known URI for its operation. It's not clear why this is necessary.

Well-known URIs are supposed to be for site-wide metadata; they're not intended as bootstrapping mechanisms for unrelated protocols.

Specifically, why can't DOH just define a HTTP resource time that takes requests (with query strings) in a certain format, and return responses in a certain format? I.e., use a URL to locate a DOH service, rather than a (host, port) tuple.

Abstract

... says:

"""
DNS queries sometimes experience problems with end to end connectivity at times and places where HTTPS flows freely.

HTTPS provides the most practical mechanism for reliable end to end communication. Its use of TLS provides integrity and confidentiality guarantees and its use of HTTP allows it to interoperate with proxies, firewalls, and authentication systems where required for transit.
"""

Do we need to be talking about the motivation here? The way it's written sounds like it's using HTTP for tunnelling, which is something we wanted to avoid (especially if future efforts take away the message that HTTPS is the best tunnel).

META: editors' draft?

Do the editors publish an ED anywhere (hopefully in HTML)? If so, could it be linked to by README.md and/or next to the repo name?

incorporate more benefits text

on list jon mattsson points out that we don't mention some pretty key architectural benefits of doh such as authentication, amplification resistance, etc. on list as "changes for draft-ietf-doh-dns-over-https-03". #99 is meant to address this

DOH discovery document

If URI templates are adopted (see #74) and you have a way of discovering the templates -- i.e. a file format that carries the templates in it -- several interesting things become possible.

For example, if you advertised your DOH service as living at https://doh.example.net/, it might return a file like this:

  {
    "api": {
      "title": "DOH",
      "links": {
        "describedBy": "urn:ietf:doh:v1"
      }
    },
    "resources": {
      "urn:ietf:doh:v1:": {
        "hrefTemplate": "/?ct={ct}&dns={dns}",
        "hrefVars": {
          "ct": "urn:ietf:doh:v1:content_type",
          "dns": "urn:ietf:doh:v1:dns"
        },
        "hints": {
          "allow": ["GET", "POST"],
          "formats": {
            "application/dns-udpwireformat": {}
          },
          "acceptPost": ["application/dns-udpwireformat"]
        }
      }
    }
  }

(note that the details here are VERY handwavy; please don't get distracted just yet)

This tells the client that the service implements DOHv1, and exposes a single resource templated with the URI template /?ct={ct}&dns={dns}. The variables in the template are mapped to IETF URNs that uniquely identify their semantics.

Furthermore, it tells clients that the service allows both GET and POST requests, and accepts POSTs in the UDP wire format, and produces responses in the udp wire format as well.

(Aside: the above example happens to be in the format defined by JSON Home, which has been discussed extensively and implemented pretty diversely (so I think it could go to RFC relatively quickly), but that isn't to say that DOH has to use JSON Home; defining its own format might be more realistic, given milestones, etc.)

This would give DOH the ability to evolve more gracefully; new capabilities could be added by defining new formats, link relation types, etc., and be discovered at runtime by clients when they consume the document.

Mixing and matching APIs and extensions is much easier in this model too; because you're not locked into a particular pattern of URLs, and you're doing runtime discovery, versioning and extension doesn't need to be a monolithic v1...vn progression.

That can also be used for coordinating role-based access control, by the way. E.g., if you have a DNS update extension, but only want to expose it to authenticated users, you'd only put it in the documents you send to authenticated users.

I talk a little more about these benefits in this blog entry.

AFAICT the downside with this approach is that the discovery document means one more round trip, but I think that's manageable; it can be cached pretty aggressively, after all, and in the use cases the WG is considering, it would be fetched at browser (or app) startup time, so it's unlikely the latency would be user perceivable.

Proposition on HTTP semantics caching & TTL adjusting

The previous doh draft is unfriendly to HTTP cache in two ways:

The DNS cache information is not transparent to HTTP cache proxies (e.g. CDN)
The TTL values is not up-to-date if retrieved from an HTTP cache proxy

The recent -03 version introduced a Cache-Control header, solving the first problem, but the second problem is still unresolved.

The consequence of an out-of-date TTL value may cause TTL bloating, which means a downstream server generates larger TTL value than the upstream server, eventually keeping stale DNS records for longer time than expected.

I am proposing a solution to the second problem:

The server SHOULD send a Cache-Control with both public / private and max-age values.

If Cache-Control: public, the result MAY be cached for all users, generated when the authority nameserver is not ECS-aware, or the response contains ECS-scope = 0;
If Cache-Control: private, the result SHOULD NOT be cached for all users, generated when the authority nameserver is ECS-aware, or the response contains ECS-scope != 0.
For Cache-Control: max-age=X, X MUST be less or equal than the smallest TTL of all response records.
The server MUST send an optional Last-Modified and Date header.
The client SHOULD adjust the relative TTL according to Last-Modified and Date.

If the client is a caching nameserver, providing access for other downstream servers, it MUST adjust the TTL value.
The rule is described as follows:
- Do nothing if LastModified >= Now.
- TTL = floor( TTL - (Now - LastModified) ),
  where floor means rounding towards negative direction.
- If TTL becomes negative, the client MUST either set it to zero,
  or OPTIONALLY resend the request in limited retry count.
- Be careful, since not all TTL fields represent time values. [Section 6.1.3, RFC 6891]
The TTL adjustment protocol only applies when ct=application/dns-udpwireformat.

I would like to hear some reply on my proposition. And if it is suitable, I hope it can be adopted in the next draft version.

Don't require HTTP/2

I think that it is fine - or more precisely, invaluable - to explain what the semantics of server push are. But you don't actually require HTTP/2 for this protocol to work.

The only thing that requiring HTTP/2 gets you is tighter constraints on TLS usage. You can get those same constraints with some deployment recommendations.

I would support strong recommendations that said HTTP/2 is useful for its multiplexing as well as similarly strong advice about following the advice in RFC 7525. Those are both good practice.

More clarification on example

Maybe just me, but I would prefer more clarification on the request format:

<33 bytes represented by the following hex encoding>
abcd 0100 0001 0000 0000 0000 0377 7777
0765 7861 6d70 6c65 0363 6f6d 0000 0100
01

Is this directly from RFC1035? If so, perhaps a (very brief) explanation of the encoding, referencing the relevant standard would be useful. Also, perhaps an additional example showing EDNS(0) extensions being passed?

Easing Transitions to new GET media types

This probably fell through the cracks when #74 was closed.

The current design of the GET-based request uses two fields that work in tandem. ct carries a content type, dns carries the query. This is suboptimal from a performance perspective when a new request format is developed, which would effectively discourage adoption of new formats (though most likely it will just prolong the lifetime of the baseline format).

A query in the form dns=<blah> with an implicit content type would be ideal. New formats for requests could do this: dns=<blah>&v2=<foo>.

POST requests can't benefit from this sort of approach. Not sure what I think about that.

Referenced CORS standard obsolete

As I understand it, the referenced CORS standard (https://www.w3.org/TR/cors/) is obsolete, and has been superseded by the WHATWG fetch spec (https://fetch.spec.whatwg.org/). Whether this is the official W3C position, I'm not clear, but the WHATWG folks certainly seem to be running with the ball.

Please also consider Google DOH protocol as reference (w/ implementation)

The DOH draft proposed two formats, which are application/dns-udpwireformat and
application/simpledns+json. I would like to talk about the latter, which is not
defined yet, but already noted with the following references:

I want to suggest that, Google is offering DNS-over-HTTPS in another JSON
format. please consider this format in the standardization process of DOH JSON
format. The the API specification is at:

https://developers.google.com/speed/public-dns/docs/dns-over-https

The difference of this format from bortzmeyer's is that, it has a uniform
presentation for various different RR types by not splitting up the data field,
thus easier to implement than some of other formats (as easy as taking the 5th
text field from the BIND zonefile).

As far as I knew, there are already some implementations supporting this
protocol, two note worthy clients are CoreDNS [1] and Dingo [2]. Dingo also
supports the OpenResolve protocol (similar to bortzmeyer's protocol).

I have implemented a DOH server / client suite [3] using the Google DOH
protocol. I extended the protocol with absolute TTL times to provide HTTP cache
semantics. This implementation has been put into stable public service for a
reasonable time. I would like to cooperate with the standardization with DOH
protocol, by implementing and experimenting with these formats, and hear
suggestions from you about the future development of DOH protocols.

[1] https://coredns.io/plugins/proxy/
[2] https://github.com/pforemski/dingo
[3] https://github.com/m13253/dns-over-https

Query Parameters - reference

The spec refers to "query parameters", but does not specify their syntax.

The most current reference for its definition is:
https://url.spec.whatwg.org/#application/x-www-form-urlencoded

"DOH client" and "DOH server" terminology

Even though the Terminology section contains "DNS API client" and "DNS API server" respectively, the -04 version of the draft still contains a few instances of "DOH client" and "DOH server". Not sure that's intended.

tools.ietf.org doesn't parse Section 9 references

In section 9 of DOH, it makes the following references:

Sections 10.6 (Compression) and 10.7 (Padding) of [RFC7540] provide some further advice on mitigations within an HTTP/2 context.

This isn't one of the formats that the tools process will convert into deep links. Instead, the "10.6" becomes a link to section 10.6 of DOH (which doesn't exist), while 10.7 doesn't become a link at all.

OCSP stapling is required to prevent circular dependency for CRL/OCSP validation

In order to prevent a dependency on traditional DNS or a DNS query for CRL/OCSP certificate validation check, OCSP stapling should be used for the HTTPS web server.

Is application/simpledns+json a real content type?

If it isn't then including it in an example might be unwise. Explicitly pointing out that it is currently fictitious might avoid confusion.

What is the URL?

The draft gets stuck into describing pieces of the URL, but neglects to point out some obvious things, like what the URL is for. It could probably use some more clarity about the configuration and discovery point as well.

clarify http errors vs dns errors

largely a restating of mailing list thread: A question on the mix of DNS and HTTP semantics

I'm certainly happy to add a sentence or two to help make that more clear (suggestions?) but I don't want to start defining/reiterating HTTP semantics in the DoH doc - we've already got lots of documents (i.e. rfc723x.. which are all included normatively by 7230 which doh includes normatively) that do that and its not in doh's charter to mess those things anyhow.

Describe discovery process

The draft says:

Using the well-known path allows automated discovery of a DNS API Service,

But it probably needs to describe at least one way to perform that automated discovery. It's probably not a good idea to have people attempt to guess.

Define the caching model

The draft doesn't really tackle the question of caching fully right now. I think that there is an implied hierarchy of caches that are in use.

The query first hits the DNS stack, which maintains its own cache. Using the DNS TTL, that cache might decide to answer the query.

Then there is an HTTP stack, which might maintain its own cache. Notably, none of the HTTP cache-busting techniques work from the perspective of the DNS client, so we have to be careful to observe that. This particularly ordering makes #13 much worse.

Finally, a DNS API server might maintain its own cache.

Describe server push semantics

Describe how GET (not POST) is used with server push.

This will ensure that you answer the bigger question: what do you do with a server push from an arbitrary server? I assume that this won't be privileged in any way, and only the configured DNS API server will have its answers routed to the DNS stack, but that needs to be explicit.

Clarification on Base64 padding characters

In Section 5.1, it says

Padding characters for base64url MUST NOT be included.

Presumably this is because, in GET format, this would result in extraneous = signs in the URL? What happens if someone (or an automated system) pastes a Base64 request including padding characters?

Use Cases

@mnot started a mailing list thread titled Use Cases. Track actions here. (continue to discuss on list)

That speaks to the motivations, but not how it will actually be achieved using this protocol. I suggest:

"""
3. Use Cases

There are two initial use cases for this protocol.

The primary use case is to prevent on-path network devices from interfering with DNS operations. This interference includes, but is not limited to, spoofing DNS responses, blocking DNS requests, and tracking.

In this use, clients -- whether operating systems or individual applications -- will be explicitly configured to use a DOH server as a recursive resolver by its user (or administrator). They might use the DOH server for all queries, or only for a subset of them. The specific configuration mechanism is out of scope for this document.

A secondary use case is allowing web applications to access DNS information, by using existing APIs in browsers to access it over HTTP.

This is technically already possible (since the server controls both the HTTP resources it exposes and the use of browser APIs by its content), but standardisation might make this easier to accomplish.

Note that in this second use, the browser does not consult the DOH server or use its responses for any DNS lookups outside the scope of the application using them; i.e., there is (currently) no API that allows a Web site to poison DNS for others.
"""

Only SHOULD on matching HTTP expiration to TTL

The draft says "the HTTP cache headers SHOULD be set to expire at the same time as the shortest DNS TTL in the response", which might not be strong enough. A SHOULD might be OK for shorter, but allowing the value to increase might be unwise, depending on the caching model.

bootstrapping / circular dependency in hostname resolution?

If DOH uses https under the hood, how does this https host name get resolved? I would have expected this to be mentioned in the RFC, though I didn't see any mention of it, besides OCSP Stapling.

Is it assumed that DOH would need to be bootstrapped by traditional DNS?

Or is it assumed that a DOH endpoint would have a TLS/SSL certificate for its ip address? (Apparently a certificate for an IP address is possible in some cases, though hard to get.)

I assume DOH is not intended to be used by home NAT routers? (How would the router get a signed TLS certificate? Maybe a DHCP response could say: use this DOH url and trust this self-signed cert for its ip address (though a certificate might be too big for a DHCP response). Or, in cases where no local DNS services are needed, the DHCP response could maybe say: use this public DOH url, here's the ip address for that hostname)

Examples in 5.2 violate SHOULD in section 5

Section 5 states:

" In order to maximize cache friendliness, DNS API clients using media
formats that include DNS ID, such as application/dns-udpwireformat,
SHOULD use a DNS ID of 0 in every DNS request."

Yet the two examples given in section 5.2 have IDs of 0xabcd. I get that it's just a SHOULD, but wouldn't it be better if the examples followed it?

clients MUST process dns-updwireformat?

I think this may have been discussed during the DOH working group, but it conflicted with SAAG so I may have missed it if it was...

But I think it's insane to state that a client MUST support the binary DNS protocol. The likely use cases of this protocol will be javascript or other simple (cli curl) clients making requests and I don't think these clients should be required to support a binary protocol over a transport that has traditionally helped clients adapt to their needs. I understand requiring the server to support it, but IMHO, we should change this MUST to a SHOULD at most.

The "Use Cases" section is confusing

Some people reading the document thought that it only applied to the two use cases given, and could not be used for other things. It might be good to just pare it down to its essentials and move it as a sub-section of the Introduction. See PR #76 .