From 2fff96311b5a740c23f1a2e1ddadd6127459ba24 Mon Sep 17 00:00:00 2001 From: Andrew Stanton-Nurse Date: Thu, 2 Feb 2017 14:41:38 -0800 Subject: [PATCH] tidying up the transport spec (#171) --- specs/TransportProtocols.md | 67 +++++++++++++++++++++++++++---------- 1 file changed, 50 insertions(+), 17 deletions(-) diff --git a/specs/TransportProtocols.md b/specs/TransportProtocols.md index fb3edfba4d..2994e9718c 100644 --- a/specs/TransportProtocols.md +++ b/specs/TransportProtocols.md @@ -20,26 +20,34 @@ The only transport which fully implements the duplex requirement is WebSockets, Throughout this document, the term `[endpoint-base]` is used to refer to the route assigned to a particular end point. The term `[connection-id]` is used to refer to the connection ID provided by the `[endpoint-base]/negotiate` end point. +**NOTE on errors:** In all error cases, by default, the detailed exception message is **never** provided; a short description string may be provided. However, an application developer may elect to allow detailed exception messages to be emitted, which should only be used in the `Development` environment. Unexpected errors are communicated by HTTP `500 Server Error` status codes or WebSockets `1008 Policy Violation` close frames; in these cases the connection should be considered to be terminated. + ## WebSockets (Full Duplex) The WebSockets transport is unique in that it is full duplex, and a persistent connection that can be established in a single operation. As a result, the client is not required to use the `[endpoint-base]/negotiate` endpoint to establish a connection in advance. It also includes all the necessary metadata in it's own frame metadata. -The WebSocket transport is activated by making a WebSocket connection to `[endpoint-base]/ws`. Upon doing so, the connection is fully established and immediately ready for frames to be sent/received. The WebSocket OpCode field is used to indicate the type of the frame (Text or Binary) and the WebSocket "FIN" flag is used to indicate the end of a message. +The WebSocket transport is activated by making a WebSocket connection to `[endpoint-base]/ws`. The **optional** `connectionId` query string value is used to identify the connection to attach to. If there is no `connectionId` query string value, a new connection is established. If the parameter is specified but there is no connection with the specified ID value, a `404 Not Found` response is returned. Upon receiving this request, the connection is established and the server responds with a WebSocket upgrade (`101 Switching Protocols`) immediately ready for frames to be sent/received. The WebSocket OpCode field is used to indicate the type of the frame (Text or Binary) and the WebSocket "FIN" flag is used to indicate the end of a message. Establishing a second WebSocket connection when there is already a WebSocket connection associated with the Endpoints connection is not permitted and will fail with a `409 Conflict` status code. +Errors while establishing the connection are handled by returning a `500 Server Error` status code as the response to the upgrade request. This includes errors initializing EndPoint types. Unhandled application errors trigger a WebSocket `Close` frame with reason code that matches the error as per the spec (for errors like messages being too large, or invalid UTF-8). For other unexpected errors during the connection, the `1008 Policy Violation` status code is used. + ## HTTP Post (Client-to-Server only) HTTP Post is a half-transport, it is only able to send messages from the Client to the Server, as such it is always used with one of the other half-transports which can send from Server to Client (Server Sent Events and Long Polling). -This transport uses HTTP Post requests to a specific end point to encode messages. Each HTTP Request represents a single message (this achieves the Frame-based requirement), contains raw binary data (Binary-safe) and uses the query string to encode the metadata. - This transport requires that a connection be established using the `[endpoint-base]/negotiate` end point. -The HTTP POST request is made to the URL `[endpoint-base]/[connection-id]/send`. The content consists of frames in the same format as the Long Polling transport. It is up to the client which of the Text or Binary protocol they use, and the server is able to detect which they use either via the `Content-Type` (see the Long Polling transport section) or via the first byte (`T` for the Text-based protocol, `B` for the binary protocol). Upon receipt of the **entire** request, the server will process and deliver all the messages, responding with `202 Accepted` if all the messages are successfully processed. If a client makes another request to `/send` while an existing one is outstanding, the existing request is terminated by the server with the `409 Conflict` status code. +The HTTP POST request is made to the URL `[endpoint-base]/send`. The **mandatory** `connectionId` query string value is used to identify the connection to send to. If there is no `connectionId` query string value, a `400 Bad Request` response is returned. The content consists of frames in the same format as the Long Polling transport. It is up to the client which of the Text or Binary protocol they use, and the server is able to detect which they use either via the `Content-Type` (see the Long Polling transport section) or via the first byte (`T` for the Text-based protocol, `B` for the binary protocol). Upon receipt of the **entire** request, the server will process and deliver all the messages, responding with `202 Accepted` if all the messages are successfully processed. If a client makes another request to `/send` while an existing one is outstanding, the new request is immediately terminated by the server with the `409 Conflict` status code. + +If the client transmits a `Close` or `Error` frame, the server will ignore and discard any frames following that frame and immediately return `202 Accepted`. Any further attempts to send to that connection will receive a `404 Not Found` response, as the connection will have been terminated. + +If a client receives either a `202 Accepted` or `409 Conflict` request, the connection remains open. Any other response indicates that the connection has been terminated due to an error. This transport will always produce single-frame messages, since the Long-Polling protocol below does not support encoding an end-of-message flag. +If the relevant connection has been terminated, a `404 Not Found` status code is returned. If there is an error instantiating an EndPoint or dispatching the message, a `500 Server Error` status code is returned. + ## Server-Sent Events (Server-to-Client only) Server-Sent Events (SSE) is a protocol specified by WHATWG at [https://html.spec.whatwg.org/multipage/comms.html#server-sent-events](https://html.spec.whatwg.org/multipage/comms.html#server-sent-events). It is capable of sending data from server to client only, so it must be paired with the HTTP Post transport. It also requires a connection already be established using the `[endpoint-base]/negotiate` endpoint. @@ -60,20 +68,25 @@ foo: boz In the first event, the value of `baz` would be `boz\nbiz\nflarg`, due to the concatenation behavior above. Full details can be found in the spec linked above. -In this transport, the client establishes an SSE connection to `[endpoint-base]/sse`, and the server responds with an HTTP response with a `Content-Type` of `text/event-stream`. Each SSE event represents a single frame from client to server. The transport uses unnamed events, which means only the `data` field is available. Thus we use the first line of the `data` field for frame metadata. The frame body starts on the **second** line of the `data` field value. The first line has the following format (Identifiers in square brackets `[]` indicate fields defined below): +In this transport, the client establishes an SSE connection to `[endpoint-base]/[connection-id]/sse`, and the server responds with an HTTP response with a `Content-Type` of `text/event-stream`. The **mandatory** `connectionId` query string value is used to identify the connection to send to. If there is no `connectionId` query string value, a `400 Bad Request` response is returned, if there is no connection with the specified ID, a `404 Not Found` response is returned. Each SSE event represents a single frame from client to server. The transport uses unnamed events, which means only the `data` field is available. Thus we use the first line of the `data` field for frame metadata. The frame body starts on the **second** line of the `data` field value. The first line has the following format (Identifiers in square brackets `[]` indicate fields defined below): ``` [Type] ``` -* `[Type]` is a single UTF-8 character representing the type of the frame; `T` indicates Text, and `B` indicates Binary. +* `[Type]` is a single UTF-8 character representing the type of the frame; `T` indicates Text, `B` indicates Binary, `E` indicates an error, `C` indicates that the server is terminating its end of the connection. -If the `[Type]` field is `T`, the remaining lines of the `data` field contain the value, in UTF-8 text. If the `[Type]` field is `B`, the remaining lines of the `data` field contain Base64-encoded binary data. Any `\n` characters in Binary frames are removed before Base64-decoding. However, servers should avoid line breaks in the Base64-encoded data. +If the `[Type]` field is `T`, the remaining lines of the `data` field contain the value, in UTF-8 text. If the `[Type]` field is `B`, the remaining lines of the `data` field contain Base64-encoded binary data. Any `\n` characters in Binary frames are removed before Base64-decoding. However, servers should avoid line breaks in the Base64-encoded data. If the `[Type]` field is `E`, the remaining lines of the `data` field may contain a textual description of the error. Also, the connection will be terminated immediately following an `E` frame. If the `[Type]` field is `C`, there is no remaining content in the `data` field. + +If the SSE response ends without a `C` or `E` frame, the client should immediately attempt to reestablish the connection. However, the protocol provides **no** guarantees that any messages that have not been seen will be retransmitted. + +TBD: Keep Alive - Should it be done at this level? For example, when sending the following frames (`\n` indicates the actual Line Feed character, not an escape sequence): * Type=`Text`, "Hello\nWorld" * Type=`Binary`, `0x01 0x02` +* Type=`Close` The encoding will be as follows @@ -85,6 +98,9 @@ data: World data: B data: AQI= +data: C + +[Connection Terminated at this point] ``` This transport will buffer incomplete frames sent by the server until the full message is available and then send the message in a single frame. @@ -99,11 +115,13 @@ Since there is such a long round-trip-time for messages, given that the client m A Poll is established by sending an HTTP GET request to `[endpoint-base]/poll` with the following query string parameters -* `id` (Required) - The Connection ID of the destination connection. +* `connectionId` (Required) - The Connection ID of the destination connection. * `supportsBinary` (Optional: default `false`) - A boolean indicating if the client supports raw binary data in responses When messages are available, the server responds with a body in one of the two formats below (depending upon the value of `supportsBinary`). The response may be chunked, as per the chunked encoding part of the HTTP spec. +If the `connectionId` parameter is missing, a `400 Bad Request` response is returned. If there is no connection with the ID specified in `connectionId`, a `404 Not Found` response is returned. + ### Text-based encoding (`supportsBinary` = `false` or not present) The body will be formatted as below and encoded in UTF-8. The `Content-Type` response header is set to `application/vnd.microsoft.aspnet.endpoint-messages.v1+text`. Identifiers in square brackets `[]` indicate fields defined below, and parenthesis `()` indicate grouping. @@ -113,19 +131,29 @@ T([Length]:[Type]:[Body];)([Length]:[Type]:[Body];)... continues until end of th ``` * `[Length]` - Length of the `[Body]` field in bytes, specified as UTF-8 digits (`0`-`9`, terminated by `:`). If the body is a binary frame, this length indicates the number of Base64-encoded characters, not the number of bytes in the final decoded message! -* `[Type]` - A single-byte UTF-8 character indicating the type of the frame, `T` indicates Text and `B` indicates Binary (case-sensitive) -* `[Body]` - The body of the message. If `[Type]` is `T`, this is just the sequence of UTF-8 characters for the text frame. If `[Type]` is `B`, the frame is Base64-encoded, so `[Body]` must be Base64-decoded to get the actual frame content. +* `[Type]` - A single-byte UTF-8 character indicating the type of the frame, see the list of frame Types below +* `[Body]` - The body of the message, the content of which depends upon the value of `[Type]` + +The following values are valid for `[Type]`: + +* `T` - Indicates a text frame, the `[Body]` contains UTF-8 encoded text data. +* `B` - Indicates a text frame, the `[Body]` contains Base64 encoded binary data. +* `E` - Indicates an error frame, the `[Body]` contains an optional UTF-8 encoded error description. The connection is immediately closed after processing this frame. +* `C` - Indicates a close frame, there is no `[Body]`. The connection is immediately closed after processing this frame. + +Note: If there is no `[Body]` for a frame, there does still need to be a `:` and `;` delimiting the body. So, for example, the following is an encoding of a single text frame `A` followed by a single close frame: `T1:T:A;0:C:;` For example, when sending the following frames (`\n` indicates the actual Line Feed character, not an escape sequence): * Type=`Text`, "Hello\nWorld" * Type=`Binary`, `0x01 0x02` +* Type=`Close` The encoding will be as follows ``` T11:T:Hello -World;4:B:AQI=; +World;4:B:AQI=;0:C:; ``` Note that the final frame still ends with the `;` terminator, and that since the body may contain `;`, newlines, etc., the length is specified in order to know exactly where the body ends. @@ -143,26 +171,31 @@ B([Length][Type][Body])([Length][Type][Fin][Body])... continues until end of the ``` * `[Length]` - A 64-bit integer in Network Byte Order (Big-endian) representing the length of the body in bytes -* `[TypeAndFin]` - An 8-bit integer indicating the type of the message. - * `0x00` => `Text` - * `0x01` => `Binary` +* `[Type]` - An 8-bit integer indicating the type of the message. + * `0x00` => `Text` - `[Body]` is UTF-8 encoded text data + * `0x01` => `Binary` - `[Body]` is raw binary data + * `0x02` => `Error` - `[Body]` is UTF-8 encoded error message + * `0x03` => `Close` - `[Body]` is empty * All other values are reserved and must **not** be used. An endpoint may reject a frame using any other value and terminate the connection. -* `[Body]` - The body of the message, exactly `[Length]` bytes in length. Text frames are always encoded in UTF-8. +* `[Body]` - The body of the message, exactly `[Length]` bytes in length. `Text` and `Error` frames are always encoded in UTF-8. For example, when sending the following frames (`\n` indicates the actual Line Feed character, not an escape sequence): * Type=`Text`, "Hello\nWorld" * Type=`Binary`, `0x01 0x02` +* Type=`Close` The encoding will be as follows, as a list of binary digits in hex (text in parentheses `()` are comments). Whitespace and newlines are irrelevant and for illustration only. ``` 0x66 (ASCII 'B') 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x0B (start of frame; 64-bit integer value: 11) -0x80 (Type = Text, Fin = True) +0x80 (Type = Text) 0x68 0x65 0x6C 0x6C 0x6F 0x0A 0x77 0x6F 0x72 0x6C 0x64 (UTF-8 encoding of 'Hello\nWorld') 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x02 (start of frame; 64-bit integer value: 2) -0x01 (Type = Binary, Fin = False) +0x01 (Type = Binary) 0x01 0x02 (body) +0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 (start of frame; 64-bit integer value: 0) +0x03 (Type = Close) ``` This transport will buffer incomplete frames sent by the server until the full message is available and then send the message in a single frame.