class: center, middle

# HTTP & REST API

API Design Best Practices & Hypertext Transfer Protocol (HTTP)

---

### Agenda

The majority of concepts presented apply to all APIs that utilize HTTP, not just RESTful APIs.

--

1. What is HTTP & REST API?
1. Requests & Responses
1. Query examples
1. Implementation details
1. Problems / other options

---

### Install

#### API Testing Tool

[Postman](https://www.postman.com/), [Insomnia](https://insomnia.rest/), [Yaak](https://yaak.app/) etc.

Import _Fake JSON REST API_: [URL](https://raw.githubusercontent.com/ttu/dotnet-fake-json-server/master/docs/FakeServer_Workspace.json)

Import _GitHub V3_: [URL](https://raw.githubusercontent.com/ttu/rest-api-presentation/main/github-workspace.json)

#### Fake JSON REST API

```sh
git clone https://github.com/ttu/dotnet-fake-json-server.git
cd dotnet-fake-json-server
docker build -t fakeapi .
docker run -it -p 57602:57602 --name fakeapi fakeapi
```

Open [http://localhost:57602/swagger/](http://localhost:57602/swagger/)

---

### Hypertext Transfer Protocol (HTTP)

An application protocol designed to transfer information between networked devices. Runs on top of other layers of the network protocol stack.

--

HTTP defines:

* How messages are formed and transmitted
* What actions web servers and browsers should take in response to various commands

???
* The information in this presentation is not only REST-specific; it also covers how HTTP should be used with REST
* https://developer.mozilla.org/en-US/docs/Web/HTTP/Overview

---

### Application Programming Interface (API)

__API__ is a set of definitions and protocols to build and integrate application software

--

User _Interface_ (UI), for humans

Application Programming _Interface_ (API), for programs

--

Different APIs:

* Web API (HTTP)
* Remote API
* Libraries
* Frameworks
* OS

---

### REST

__RE__presentational __S__tate __T__ransfer

* Refers to a style of web architecture
* Each resource lives at its own URL
* Manipulated exclusively through the HTTP protocol
* Stateless
* Based on documents

???

* Refers to a style of web architecture that has many underlying characteristics and governs the behavior of clients and servers
* Connection is stateless and all the information necessary for the server to understand the request is in the request
* Documents are a simple model of communication

---

### REST API

A REST API defines a set of functions with which developers can perform requests and receive responses via the HTTP protocol.

--

If you implement a REST API, any developer who uses it should already be familiar with how to interact with it.

???

* REST describes how an API should be organized and how it should work

---

### HTTP Request & Response

To view the communication between the client and server, execute the curl command or examine the request timeline in an API client.

```sh
curl https://api.github.com/users/ttu --verbose
```

1. Connection Establishment
2. Request and Response
3. Additional Information

```sh
* Preparing request to https://api.github.com/users/ttu
* Current time is 2023-12-06T06:55:05.977Z
* ...

> GET /users/ttu HTTP/2
> Host: api.github.com
>

< HTTP/2 200
< server: GitHub.com
< date: Wed, 06 Dec 2023 06:46:57 GMT
< ...

{
  "login": "ttu",
  "node_id": "MDQ6VXNlcjU4OTc4Ng==",
  ...

* Received 250 B chunk
* ...
```

---

### HTTP Request

* Method
* Endpoint (route)
* Headers
* Body

--

```txt
[Method] [Endpoint] [HTTP-Version]
[Headers]

[Body]
```

--

```txt
POST /api/users HTTP/2
Host: api.example.com
Authorization: Bearer AbCdEf123456
Content-Type: application/json

{ "name": "James", "age": 40 }
```

???

* With HTTP 1.x everything is plain text

---

### HTTP Methods

HTTP defines a set of request methods to indicate the desired action to be performed for a given resource.

--

* __GET__
  * Fetch resource
* __POST__
  * Create resource
* __PUT__
  * Replace resource
* __DELETE__
  * Delete resource
* __PATCH__
  * Update resource

--

* __HEAD__
  * Get the metadata and headers without receiving a response body
* __OPTIONS__
  * Return allowed methods

???

* PATCH:
  * Rarely used for updates
  * With POST can have some race conditions
* HEAD: https://github.com/ttu/dotnet-fake-json-server#head-method
  * Get count of items, check if the item exists without downloading the body
  * Rarely used
* OPTIONS: https://github.com/ttu/dotnet-fake-json-server#options-method
  * Rarely used
* Request method can be
  * Request should only perform desired action
  * Safe
    * Won't make changes to any data (except logging etc.)
  * Cacheable
  * Idempotent
    * Making multiple identical requests has the same effect as making a single request
    * GET, PUT, DELETE (PATCH only if implemented that way)
    * POST creates new resources with each call

---

### Endpoint / URL

The endpoint is one end of a communication channel. Often this would be represented as a URL of a server or service.

__Root-endpoint__: `http://localhost:5678/api`

__Path__: `/:collection/:id`

__Query parameters__: `?query1=value1&query2=value2`

--

----

#### Example

Get all users:

```txt
http://localhost:5678/api/users/
```

--

Get user with id 1:

```txt
http://localhost:5678/api/users/1
```

--

Get users from Helsinki:

```txt
http://localhost:5678/api/users/?city=helsinki
```

???
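For the presenter: the URL parts named on the slide can be shown with Python's standard `urllib.parse`. This is only a minimal sketch; the helper name `describe_endpoint` is made up for this example, and the URL is the one from the slide.

```python
from urllib.parse import urlsplit, parse_qs


def describe_endpoint(url: str) -> dict:
    """Split a URL into the parts named on the slide (hypothetical helper)."""
    parts = urlsplit(url)
    return {
        "protocol": parts.scheme,         # http or https
        "hostname": parts.hostname,       # e.g. localhost
        "port": parts.port,               # e.g. 5678
        "path": parts.path,               # identifies the requested resource
        "query": parse_qs(parts.query),   # key=value pairs after the "?"
    }


if __name__ == "__main__":
    print(describe_endpoint("http://localhost:5678/api/users/?city=helsinki"))
```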
* Protocol: http(s)
* Hostname: localhost
* Port: 5678
* Path defines the requested resource
* The `key=value` form, e.g. `city=helsinki`, is just a convention for defining queries.

---

### Functions as Endpoints (Not the REST-Way)

* Get companies
  * `/get-companies`
* Add new company
  * `/add-new-company`
  * `body: new company data`
* Get company employees
  * `/get-company-employees`
  * `body: companyId`
* Get single employee from the company
  * `/get-single-company-employee`
  * `body: companyId, employeeId`

???

* Functions as endpoints = remote procedure calls
* Name of the endpoint defines the action
* POST method should be used with functions, as parameters are in the body
* Often returns 200 both on success and on errors

---

### REST-Way

The URL should only contain resources (nouns), not actions or verbs. HTTP methods define the action.

--

* Get companies
  * `GET /companies`
* Add new company
  * `POST /companies`
  * `body: new company data`
* Update company
  * `PUT /companies/:companyId`
  * `body: company data`
* Get company employees
  * `GET /companies/:companyId/employees`
* Get single employee from the company
  * `GET /companies/:companyId/employees/:employeeId`

---

### Endpoint Examples from GitHub

__root-endpoint__: `https://api.github.com/`

https://developer.github.com/v3/users/

* `/users/:username`
* `/users/:username/repos`
* `/users/:username/repos/:repoid`

https://developer.github.com/v3/repos/#list-all-public-repositories

* `/repositories`
* `/repositories/:id`

https://developer.github.com/v3/repos/#get

* `/repos/:owner/:repo`
* `/repos/:owner/:repo/pulls`
* `/repos/:owner/:repo/pulls/:number`

???

NOTE: Show with API client

---

### HTTP Headers

HTTP Headers are property-value pairs that are separated by a colon.

-----

```sh
$ curl -H "Accept: application/json" https://api.github.com/users --verbose
```

```txt
> GET /users HTTP/2
> Host: api.github.com
> user-agent: curl/7.68.0
> accept: application/json
...
```

???
* `>` is request
* `<` is response

---

* __Authorization__ - contains user authorization information.

```sh
Authorization: Bearer azasdasdasf....asfasfasfsa
```

* __Accept__ - client tells what kind of data is requested.

```sh
Accept: application/json
```

* __Content-Type__ - in the request, the client tells the type of the body. In the response, the server tells the type of the response body.

```sh
Content-Type: application/vnd.com.example.customer+xml
```

* __Allow__ - the response to an OPTIONS request lists the allowed methods.

```sh
Allow: GET, POST
```

* __Last-Modified__ - server tells the time when it thinks the resource was last modified.

```sh
Last-Modified: Wed, 21 Oct 2015 07:28:00 GMT
```

* And 100 others... https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers

???

MDN has some of the best web documentation

https://developer.mozilla.org/en-US/docs/Web/HTTP/Basics_of_HTTP/MIME_types

---

### Body

The body contains information you want to be sent to the server.

--

JSON (JavaScript Object Notation) is a common format for sending and requesting data through a REST API.

The body is normally used only with __POST__, __PUT__, __PATCH__ or __DELETE__ requests.

```json
{
  "property1": "value1",
  "property2": "value2",
  "property3": "value3"
}
```

???

* Other: structured form-data, xml, binary

---

### Query operations

* __Query__:
  * `GET /companies`
* __Sorting__:
  * `GET /companies/1/employees?sort=salary_asc`
* __Filter__:
  * `GET /companies?country=finland&domain=2`
* __Filter operators__:
  * `GET /companies?revenue_gt=20000000`
* __Slicing / Paging operators__:
  * `GET /companies?page=7&pageSize=20`
* __Select fields__:
  * `GET /companies?fields=familyName,address`

--

* Links
  * https://github.com/ttu/dotnet-fake-json-server#sort
  * https://github.com/ttu/dotnet-fake-json-server#filter
  * https://github.com/ttu/dotnet-fake-json-server#filter-operators
  * https://github.com/ttu/dotnet-fake-json-server#slice
  * https://github.com/ttu/dotnet-fake-json-server#select-fields

???
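For the presenter: query URLs like the ones on the slide can be assembled with `urllib.parse.urlencode` from the standard library. A minimal sketch; the helper name `build_query_url` and the base URL (the local fake server from the Install slide) are assumptions made for this example.

```python
from urllib.parse import urlencode


def build_query_url(base: str, path: str, **params) -> str:
    """Append query parameters to an endpoint path (hypothetical helper)."""
    # safe="," keeps the comma-separated field list readable in the URL
    query = urlencode(params, safe=",")
    return f"{base}{path}?{query}" if query else f"{base}{path}"


if __name__ == "__main__":
    base = "http://localhost:57602/api"  # assumed root endpoint
    print(build_query_url(base, "/companies", page=7, pageSize=20))
    print(build_query_url(base, "/companies", fields="familyName,address"))
```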
* No standards
* Show from GitHub
* _asc = ascending, _lt = less than, _gt = greater than
* Paging is sometimes done with a DB "hash key", which is the id of the last item on the page

---

### HTTP Response

* Return code
* Headers
* Body

```txt
[Status-Line]
[Headers]

[Body]
```

---

### Status Line & Headers

__Status line__: HTTP-Version SP Status-Code SP Reason-Phrase CRLF

__Headers__: Property-value pairs that are separated by a colon

```txt
< HTTP/1.1 200 OK
< Server: GitHub.com
< Date: Fri, 09 Feb 2018 05:59:24 GMT
< Content-Type: application/json; charset=utf-8
< Content-Length: 2165
< Status: 200 OK
< X-RateLimit-Limit: 60
< X-RateLimit-Remaining: 59
< X-RateLimit-Reset: 1518159564
< Cache-Control: public, max-age=60, s-maxage=60
< Vary: Accept
< ETag: "1f1633719eaf8a94c8e4fb324cee3058"
< X-GitHub-Media-Type: github.v3
< Access-Control-Expose-Headers: ETag, Link, Retry-After, X-GitHub-OTP, X-RateLimit-Limit, X-RateLimit-Remaining, X-RateLimit-Reset, X-OAuth-Scopes, X-Accepted-OAuth-Scopes, X-Poll-Interval
< Access-Control-Allow-Origin: *
< Content-Security-Policy: default-src 'none'
< Strict-Transport-Security: max-age=31536000; includeSubdomains; preload
< X-Content-Type-Options: nosniff
< X-Frame-Options: deny
< X-XSS-Protection: 1; mode=block
< X-Runtime-rack: 0.007814
< X-GitHub-Request-Id: 05FE:2194:3B8CCF:6F44E3:5A7D38BA
```

???

* https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/
* __X-RateLimit-Limit__: request limit
* __ETag__: an identifier for a specific version of a resource

---

### HTTP Return Codes

#### 200 Success

* 200 OK
* 201 Created
* 202 Accepted
* 204 No Content

--

#### 300 Redirect

* 304 Not Modified

???
* 201 Created: the response might have the new id in the body and the location in the `Location` header
* 202 Accepted: mainly used with async functionality

---

### HTTP Return Codes

#### 400 Client Error

* 400 Bad Request
* 401 Unauthorized
* 403 Forbidden
* 404 Not Found
* 405 Method Not Allowed
* 409 Conflict
* 422 Unprocessable Entity

--

#### 500 Server Error

* 500 Internal Server Error
* 503 Service Unavailable

???

* 301 Moved not used with REST(?)
* 303 See Other used when the async job is ready and the response has a location header to a new resource
* 4xx codes are used to tell the client that a fault has taken place on THEIR side. They should not re-transmit the same request again but fix the error first.
* 404 Not Found - can come from a wrong URL, or the server returns it if data is not found
* 405 Method Not Allowed - method not allowed for endpoint
* 409 Conflict - conflict with current state, concurrent requests, data already exists, already modified
* 422 Unprocessable Entity - data is syntactically valid and the server understands it, but can't process it for some reason

---

### Correct Codes for Methods

* `GET /api/:collection/:id`

```
200 OK          : Item is found
400 Bad Request : Collection is not found
404 Not Found   : Item is not found
```

--

* `POST /api/:collection`

```
201 Created              : New item is created
400 Bad Request          : New item is null
409 Conflict             : Resource already exists
422 Unprocessable Entity : Invalid data, business rule violation
```

--

* `PUT /api/:collection/:id`

```
204 No Content           : Item is replaced
400 Bad Request          : Item is null
404 Not Found            : Item is not found
409 Conflict             : Updated with conflicting data
422 Unprocessable Entity : Invalid data, business rule violation
```

???
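For the presenter: the mapping from outcome to status code on this slide can be walked through with a small in-memory model. This is a sketch under stated assumptions, not the fake server's implementation: the `Collection` class, the "name must be a string" business rule and the duplicate-name conflict check are all invented for illustration.

```python
from http import HTTPStatus


class Collection:
    """Hypothetical in-memory collection used only to illustrate status codes."""

    def __init__(self):
        self.items = {}
        self._next_id = 1

    def get(self, item_id):
        if item_id not in self.items:
            return HTTPStatus.NOT_FOUND, None        # 404: item is not found
        return HTTPStatus.OK, self.items[item_id]    # 200: item is found

    def post(self, item):
        if item is None:
            return HTTPStatus.BAD_REQUEST, None      # 400: new item is null
        if not isinstance(item.get("name"), str):
            return HTTPStatus(422), None             # 422: made-up business rule
        if any(i.get("name") == item["name"] for i in self.items.values()):
            return HTTPStatus.CONFLICT, None         # 409: resource already exists
        item_id = self._next_id
        self._next_id += 1
        self.items[item_id] = {**item, "id": item_id}
        return HTTPStatus.CREATED, self.items[item_id]  # 201: new item is created

    def put(self, item_id, item):
        if item is None:
            return HTTPStatus.BAD_REQUEST, None      # 400: item is null
        if item_id not in self.items:
            return HTTPStatus.NOT_FOUND, None        # 404: item is not found
        self.items[item_id] = {**item, "id": item_id}
        return HTTPStatus.NO_CONTENT, None           # 204: item is replaced
```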
* 422 vs 400: the request is syntactically correct, but the server is unable to process it

---

### Correct Codes for Methods

* `PATCH /api/:collection/:id`

```
204 No Content           : Item updated
400 Bad Request          : PATCH is empty / not valid
404 Not Found            : Item is not found
409 Conflict             : Updated with conflicting data
422 Unprocessable Entity : Invalid data, business rule violation
```

--

* `DELETE /api/:collection/:id`

```
204 No Content : Item deleted
404 Not Found  : Item is not found
```

---

### REST API Maturity Level

Richardson Maturity Model

0. HTTP
1. Resources
2. HTTP Verbs
3. HATEOAS

https://martinfowler.com/articles/richardsonMaturityModel.html

???

* https://blog.restcase.com/4-maturity-levels-of-rest-api-design/
* Some might add versioning to the list

---

### HATEOAS

__H__ypermedia __a__s __t__he __E__ngine __o__f __A__pplication __S__tate is a principle that hypertext links should be used to provide better navigation through the API.

```json
{
  "id": 711,
  "manufacturer": "bmw",
  "model": "X5",
  "seats": 5,
  "drivers": [
    {
      "id": "23",
      "name": "Stefan Jauker",
      "links": [
        { "rel": "self", "href": "/api/v1/drivers/23" },
        { "rel": "address", "href": "/api/v1/drivers/23/address" }
      ]
    }
  ]
}
```

???

* Client doesn't need to know how to interact with an application for different actions, as all the metadata will be embedded in responses from the server.

---

### Next Topics

1. API Versioning
1. More on PATCH method
1. Safe methods
1. Idempotency
1. Caching & Mid-air collisions
1. GET & body
1. Errors & problem details
1. API Security
1. Problem Details

---

### API Versioning

##### No Versioning

- Common to start without versioning
- Adding and removing fields from the return value might not require versioning if the client can handle it
- Bigger changes require a new version

--

##### URI

```txt
https://developer.github.com/v2/repos/
https://developer.github.com/v3/repos/
```

--

##### Querystring

```txt
https://developer.github.com/repos/?version=2
https://developer.github.com/repos/?version=3
```

--

##### Header

```txt
GET https://developer.github.com/repos/ HTTP/1.1
Custom-Header: api-version=3
```

???

* GitHub examples have v3 in the URL

---

### PATCH

Partial updates for resources.

Implementation is often complex, so simple APIs often use PUT instead.

Different patch formats:

#### [Merge Patch](https://datatracker.ietf.org/doc/html/rfc7396)

```json
{
  "baz": "bar",
  "hello": { "foo": null }
}
```

#### [JSON Patch](http://jsonpatch.com/)

```json
[
  { "op": "replace", "path": "/baz", "value": "boo" },
  { "op": "add", "path": "/hello", "value": ["world"] },
  { "op": "remove", "path": "/foo" }
]
```

???

* If a library supports PATCH, then it is easy to take into use

---

### Safe Methods

The method is safe if it doesn't alter the state of the server (read-only).

--

Common __safe methods:__ GET, HEAD, OPTIONS

Common __unsafe methods:__ PUT, DELETE, POST, PATCH

--

* The client doesn't request any server change itself, and therefore won't create an unnecessary load or burden for the server.
* Browsers can call safe methods without fear of causing any harm to the server. This allows them to perform activities like pre-fetching without risk.

--

GET methods shouldn't cause too much load on the server, as the requester might assume that it is safe to make multiple calls.

---

### Idempotency

The method is __idempotent__ if an __identical__ request can be made __once or several times__ in a row with the same effect while leaving the server in the same state.
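The definition above can be checked with a small experiment: apply the same request once and several times to a fresh store and compare the resulting state. This is a minimal sketch; `UserStore` and `replay` are hypothetical names, and the store is an in-memory stand-in for a real back end.

```python
class UserStore:
    """Hypothetical in-memory back end, seeded with one user."""

    def __init__(self):
        self.users = {1: {"name": "James"}}
        self._next_id = 2

    def post(self, data):
        # Every call creates a new resource, so POST is not idempotent
        user_id = self._next_id
        self._next_id += 1
        self.users[user_id] = dict(data)
        return 201

    def put(self, user_id, data):
        if user_id not in self.users:
            return 404
        self.users[user_id] = dict(data)  # full replacement
        return 204

    def delete(self, user_id):
        if user_id not in self.users:
            return 404
        del self.users[user_id]
        return 204


def replay(request, times):
    """Send the same request `times` times to a fresh store."""
    store = UserStore()
    statuses = [request(store) for _ in range(times)]
    return store.users, statuses
```

With this sketch, replaying a PUT or DELETE three times leaves the same state as sending it once, even though the status codes of a repeated DELETE differ, while each replayed POST adds one more user.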
--

Repeating an idempotent request should not cause any additional side effects (except for updating statistics/logs).

__IDEMPOTENT:__ GET, HEAD, PUT, DELETE, PATCH (only if implemented that way)

__NOT:__ POST

--

To be idempotent, only the actual back-end state of the server is considered. The status code returned by each request may differ:

* PATCH / DELETE
  * The first call will likely return a 200
  * Successive ones will likely return a 404 or error

???

* Idempotent if implemented correctly. Never trust.
* POST is not, as it will create a new record on each request
* Most PATCH operations always have the same response, some can return an error on successive calls (e.g. move)

---

### Caching & Mid-Air Collisions

__Caching__ saves the server from sending the payload if the client already has the most recent version

__Mid-air collision detection__ prevents overwriting data that has been changed in the meantime

--

Both use __The ETag (or entity tag)__ HTTP response header

--

* __ETag__
  * Identifier for a specific version of a resource
  * Often a hash of the resource data

--

* __GET__
  * Verify that data has not changed with the `If-None-Match` header
  * Data has not changed: 304 Not Modified

--

* __PUT__
  * Verify that data is still the same as the client's original data with the `If-Match` header
  * If data is not the same: 412 Precondition Failed

???

* Show from https://api.github.com/users/ttu
* https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/If-None-Match
* These are not so commonly used
* For ETag, GET saves the server from sending large payloads
* With PUT this is rarely used. The last update is usually fine.

---

### GET & Body

GET requests should not have a body.

* Server might ignore the body
  * Includes proxies, load balancers etc.
* GET requests are often cached
  * Body is not part of the cache key
  * Body might not be cached at all
* Some clients do not support sending a body with GET

--

Use the POST method instead. Name the endpoint as a function or add a good description.
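As a rough sketch of that advice, a complex query can be sent as a POST to an endpoint named as a function. The `/users/search` endpoint and the helper name `build_search_request` are assumptions made for this example, not part of any API shown in this presentation; the request is only built here, not sent.

```python
import json
from urllib.request import Request


def build_search_request(base_url: str, criteria: dict) -> Request:
    """Build a POST request that carries search criteria in the body."""
    body = json.dumps(criteria).encode("utf-8")
    return Request(
        url=f"{base_url.rstrip('/')}/users/search",  # hypothetical endpoint
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Accept": "application/json",
        },
    )


if __name__ == "__main__":
    req = build_search_request("http://localhost:5678/api", {"city": "helsinki"})
    print(req.get_method(), req.full_url, req.data)
```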
--

Coming at some point: [QUERY-method](https://httpwg.org/http-extensions/draft-ietf-httpbis-safe-method-w-body.html)

---

### Errors & Problem Details

APIs return errors in many different ways. Check the service's API documentation to see how it implements errors.

E.g. GitHub's response for `404`

```json
{
  "message": "Not Found",
  "documentation_url": "https://docs.github.com/rest"
}
```

--

One recommendation is to use [Problem Details](https://tools.ietf.org/html/rfc7807) when returning errors.

```txt
{
  type: string
  title: string
  detail: string
}
```

???

* Problem Details is a standardized format. Less to decide with the team.

---

### Problem Details

```json
HTTP/1.1 403 Forbidden
Content-Type: application/problem+json
Content-Language: en

{
  "type": "https://example.com/probs/out-of-credit",
  "title": "You do not have enough credit.",
  "detail": "Your current balance is 30, but that costs 50.",
  "instance": "/account/12345/msgs/abc",
  "balance": 30,
  "accounts": ["/account/12345", "/account/67890"],
  "status": 403
}
```

Problem details can also have any number of additional fields. These should always be the same for each problem type.

If the API is designed to communicate with the front-end and the user locale is known, then the title and detail should be localized for the client.

---

### API Security - Authentication

* API KEY
* Basic Auth
* Token

```sh
$ curl -H 'X-API-KEY: abcd1234' http://localhost:57602/api
$ curl -u admin:root http://localhost:57602/api
$ curl -H 'Authorization: Basic YWRtaW46cm9vdA==' http://localhost:57602/api
$ curl -H 'Authorization: Bearer AbCdEf123456' http://localhost:57602/api
```

How to fetch token example: https://github.com/ttu/dotnet-fake-json-server#token-authentication

???
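For the presenter: the Basic header in the curl example is just `admin:root` encoded with base64, which anyone can decode. A minimal sketch with Python's standard `base64` module; the two helper names are made up for this example.

```python
import base64


def basic_auth_header(username: str, password: str) -> str:
    """Build the value of an Authorization header for Basic auth."""
    token = base64.b64encode(f"{username}:{password}".encode("utf-8"))
    return f"Basic {token.decode('ascii')}"


def decode_basic_auth(header: str) -> tuple:
    """Recover the credentials, showing that base64 is not encryption."""
    scheme, _, token = header.partition(" ")
    if scheme != "Basic":
        raise ValueError("not a Basic Authorization header")
    username, _, password = base64.b64decode(token).decode("utf-8").partition(":")
    return username, password


if __name__ == "__main__":
    header = basic_auth_header("admin", "root")
    print(header)
    print(decode_basic_auth(header))
```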
* It is not recommended to use Basic Authentication in production, as base64 is a reversible encoding

---

### API Security - Protecting Endpoints

__Caching__

* Cache all safe (GET) requests for reduced latency and improved performance
  * Cache non-customer-specific requests for extended durations to minimize frequent database access
  * Cache customer-specific requests for shorter durations to ensure up-to-date data
* Prioritize caching closer to the customer for optimized delivery times
  * `(CDN > load balancer > endpoint > BL)`

--

__Duplicate identical request detection and prevention__

* Prevent users from overwhelming the service with repetitive requests that consume unnecessary resources
* Implement request deduplication techniques to identify and handle identical requests effectively

--

__Idempotency__

* Employ idempotency mechanisms to ensure that multiple requests for the same action are treated as a single transaction

---

### API Security - Protecting Endpoints

__Rate limiting__

* Establish rate limits to regulate the frequency at which users can access API resources
* Define appropriate rate limits based on user roles, access patterns, and resource requirements

--

__Fault tolerance__

* Enhance the system's resilience to unexpected errors and conditions
* Implement error handling mechanisms to gracefully handle unhandled exceptions without causing system crashes
* Introduce timeouts for requests to external systems to prevent cascading failures

--

__Configure Cross-origin resource sharing (CORS)__

* Specify which websites or domains are authorized to access the API

???

* Cache all safe requests
* Cache as close to the customer as possible, e.g. CDN, load balancer, service/endpoint, business logic
* Idempotency: Duplicate requests do not leave the system unstable.
* Rate limit requests from the same IP
* Prevent similar POST requests, e.g. creation of new contact requests with the same email etc.
* Prevent requests on CDN, etc.

---

### Other APIs - Request / Response

1. SOAP
  * SOAP is an XML message format used in Web service interactions. SOAP messages are typically sent over HTTP or JMS, but other transport protocols can be used. The use of SOAP in a specific Web service is described by a WSDL definition.
1. RPC
  * RPC (Remote procedure call) is like a remote function call
  * JSON/XML/HTTP RPC
  * e.g. https://api.slack.com/web
1. API Query Layers
  * OData
  * GraphQL
1. Hypermedia Media Types
  * JSON API (application/vnd.api+json)
    * http://jsonapi.org/examples/
  * Siren (application/vnd.siren+json)

???

* Frameworks that create CRUD for DB
  * https://postgrest.org/en/stable/
  * Hasura, PostGraphile
  * https://www.moesif.com/blog/graphql/technical/Ways-To-Add-GraphQL-To-Your-Postgres-Database-Comparing-Hasura-Prisma-and-Others/

---

### Other APIs - Event

1. WebHooks
  * Incoming webhooks are a simple way to post messages from external sources into the service
  * Outgoing webhooks will send an HTTP POST request to your specified URL
1. WebSub
  * Publish/Subscribe model
1. Server Sent Events (SSE)
  * One-way streaming APIs can be used to send regular and sustained updates, which can be more efficient than regular polling of an API
1. WebSockets
  * Two-way API streams that require data to be both sent and received, providing full-duplex communication channels over a single TCP connection

---

### Other APIs - Other

1. Kafka / RabbitMQ / ESB / MessageBus
  * Message Buses can also act as APIs
1. Message Queuing Telemetry Transport (MQTT), ZeroMQ, gRPC (ProtoBuf)
  * https://grpc.io/

---

### Problems with REST

* Standards are followed poorly
  * No precise specifications
  * Just a vague "RESTful philosophy"
  * Prone to endless debates
  * Many ways to achieve almost correct solutions
* REST is often a thin wrapper around a database
  * Business logic is shifted to the client

--

### Reality

REST APIs are often not actual REST APIs, but just requests over HTTP + JSON with a REST-like endpoint structure

---

### Problems with REST

#### API should model business processes

Modeling can be hard with REST. Mixing resource names and actions sometimes makes more sense.

```sh
# Get contract with id
GET /api/contracts/{contract_id}

# Get contracts with data from other resources
GET /api/contracts/contracts-with-all-required-data-bla-bla

# Create new contract
POST /api/contracts/

# Create report of terminated contracts
# Doesn't support other formats than csv
GET /api/contracts/terminated-contracts-report-csv
```

---

### Problems with REST

#### For actions, the resource model is not always the most understandable way to express what should be performed

##### Shipment status notification

* PUT `/api/shipments/{id}`
  * Update resource's status_notification
* POST `/api/shipments/{id}/status-notifications`
  * Create new status_notification resource

--

##### Cancel a task

* PUT `/api/tasks/{id}`
  * Set active-property from the task resource to false
* DELETE `/api/tasks/{id}`
  * Delete the task resource

--
Maybe better to use functions over HTTP (Remote Procedure Calls)?

`/api/send-shipment-notification` and `/api/cancel-task`

???

* Have to know what actions will happen when the resource is updated

---

### Problems with REST

#### JSON

There are only two times when JSON is a particularly good choice:

--

* Messages need to be human-readable

--

* It is consumed from JavaScript without using a library

--

JSON is not an important format for internal services

--

#### HTTP

HTTP is maybe not the best protocol for modern days

* 30-year-old protocol that was only meant to be used for transferring hypermedia documents

--
Ditch REST, HTTP and JSON and forget inter-developer REST holy wars

Choose e.g. [gRPC](https://grpc.io/docs/what-is-grpc/introduction/) or just use simple functions as endpoints (RPC).

???

* No standards -> lots of opinions -> war

---

### References

* https://medium.freecodecamp.org/rest-is-the-new-soap-97ff6c09896d
* http://habitatchronicles.com/2017/11/a-slightly-skeptical-perspective-on-rest/
* https://blog.refineri.co.uk/the-good-way-to-rest-part-1-909566ae790
* https://apievangelist.com/2018/02/03/api-is-not-just-rest/
* https://hackernoon.com/restful-api-designing-guidelines-the-best-practices-60e1d954e7c9
* https://developer.github.com/v3/