CORS! #1

Merged
jkaufeld merged 9 commits from origins into main 2024-09-02 00:39:35 -04:00
11 changed files with 288 additions and 20 deletions
Showing only changes of commit 0cb645ce4e - Show all commits


@ -8,13 +8,15 @@ This is `spiderweb`, a WSGI-compatible web framework that's just big enough to h
- Learn a lot
- Create an unholy blend of Django and Flask
- Not look at any existing code. Go off of vibes alone and try to solve all the problems I could think of in my own way
- Not look at any existing code[^1]. Go off of vibes alone and try to solve all the problems I could think of in my own way
> [!WARNING]
> This is a learning project. It should not be used for production without heavy auditing. It's not secure. It's not fast. It's not well-tested. It's not well-documented. It's not well-anything. It's a learning project.
>
> That being said, it's fun and it works, so I'm counting that as a win.
> [!TIP|style:flat]
> To jump in with both feet, [head over to the quickstart!](quickstart.md)
## Design & Usage Decisions
@ -90,6 +92,7 @@ Simply having these declared in a place that Django can find them is enough, and
Spiderweb takes a middle ground approach: it allows you to declare framework-first arguments on the SpiderwebRouter object, and if you need to pass along other data to other parts of the system (like custom middleware), you can do so by passing in any keyword argument you'd like to the constructor.
```python
from spiderweb import SpiderwebRouter
from peewee import SqliteDatabase
app = SpiderwebRouter(
@ -112,7 +115,6 @@ Here's a non-exhaustive list of things this can do:
- URLs with variables in them à la Django
- Full middleware implementation
- Limit routes by HTTP verbs
- (Only GET and POST are implemented right now)
- Custom error routes
- Built-in dev server
- Gunicorn support
@ -120,13 +122,11 @@ Here's a non-exhaustive list of things this can do:
- Static files support
- Cookies (reading and setting)
- Optional append_slash (with automatic redirects!)
- ~~CSRF middleware implementation~~ (it's there, but it's crappy and unsafe. This might be beyond my skillset.)
- CSRF middleware
- CORS middleware
- Optional POST data validation middleware with Pydantic
- Database support (using Peewee, but you can use whatever you want as long as there's a Peewee driver for it)
- Session middleware with built-in session store
- Database support (using Peewee, but you can use whatever you want as long as there's a Peewee driver for it)
- Tests (currently a little over 80% coverage)
## What's left to build?
- Fix CSRF middleware
- Add more HTTP verbs
[^1]: I mostly succeeded. The way that I'm approaching this is that I did my level best, then looked at (and copied) existing solutions where necessary. At the time of this writing, I did all of it solo except for the CORS middleware. [Read more about it here.](middleware/cors.md)


@ -3,7 +3,7 @@
> the web framework just big enough for a spider
[GitHub](https://github.com/itsthejoker/spiderweb/)
[Get Started](#spiderweb)
[Get Started](/README)
![color](#222)


@ -5,5 +5,6 @@
- [overview](middleware/overview.md)
- [session](middleware/sessions.md)
- [csrf](middleware/csrf.md)
- [cors](middleware/cors.md)
- [pydantic](middleware/pydantic.md)
- [writing your own](middleware/custom_middleware.md)


@ -10,3 +10,10 @@
> [!NOTE]
> An alert of type 'note' using global style 'callout'.
> [!TIP|style:flat|label:My own heading|iconVisibility:hidden]
> An alert of type 'tip' using alert specific style 'flat' which overrides global style 'callout'.
> In addition, this alert uses an own heading and hides specific icon.
> [!NOTE|icon:fa-solid fa-notes]
> A custom icon!


@ -58,8 +58,11 @@
<script src="https://cdn.jsdelivr.net/npm/docsify-tabs@1"></script>
<!-- search -->
<script src="//cdn.jsdelivr.net/npm/docsify/lib/plugins/search.min.js"></script>
<!-- footnotes -->
<script src="//cdn.jsdelivr.net/npm/@sy-records/docsify-footnotes/lib/index.min.js"></script>
<!-- click to copy in code blocks -->
<script src="//cdn.jsdelivr.net/npm/docsify-copy-code/dist/docsify-copy-code.min.js"></script>
<script src="https://kit.fontawesome.com/940400877f.js" crossorigin="anonymous"></script>
<script defer data-domain="itsthejoker.github.io/spiderweb" src="https://plausible.io/js/script.js"></script>
</body>
</html>

docs/middleware/cors.md Normal file

@ -0,0 +1,177 @@
# cors middleware
```python
from spiderweb import SpiderwebRouter

app = SpiderwebRouter(
    middleware=["spiderweb.middleware.cors.CorsMiddleware"],
)
```
CORS, or Cross-Origin Resource Sharing, is an incredibly important piece of how different parts of the web communicate. As such, there is a CORS handler built into Spiderweb.
> [!TIP]
> The CorsMiddleware should be placed as high as possible in the middleware list, as it needs as much control as possible over requests and responses.
This implementation is lovingly ~~ripped~~ ~~lifted~~ borrowed from [Django CORS Headers](https://github.com/adamchainz/django-cors-headers/), an industry-standard implementation for handling CORS that has existed for over a decade. It is essentially and functionally the same. The docs below are ~~copy-and-pasted~~ also borrowed from Django CORS Headers, with updates where needed. (They just already do a great job of explaining these things.)
The available configurations are listed below. You must set at least one of the following three settings:
- `cors_allowed_origins`
- `cors_allowed_origin_regexes`
- `cors_allow_all_origins`
## cors_allowed_origins
A list of origins that are authorized to make cross-site HTTP requests. The origins in this setting will be allowed, and the requesting origin will be echoed back to the client in the access-control-allow-origin header. Defaults to `[]`.
An Origin is defined as a URI scheme + hostname + port, or one of the special values 'null' or 'file://'. Default ports (HTTPS = 443, HTTP = 80) are optional.
```python
app = SpiderwebRouter(
    cors_allowed_origins=[
        "https://example.com",
        "https://sub.example.com",
        "http://localhost:8080",
        "http://127.0.0.1:9000",
    ]
)
```
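To make the "default ports are optional" rule concrete, here is a minimal sketch of comparing origins with the standard library. `normalize_origin` and `DEFAULT_PORTS` are hypothetical names used for illustration only; they are not part of Spiderweb's API, and the middleware's actual comparison may differ.

```python
from typing import Optional, Tuple
from urllib.parse import urlsplit

# Ports that may be left out of an origin because the scheme implies them.
DEFAULT_PORTS = {"http": 80, "https": 443}


def normalize_origin(origin: str) -> Tuple[str, str, Optional[int]]:
    """Hypothetical helper: reduce an origin to (scheme, hostname, port)."""
    parts = urlsplit(origin)
    # Fall back to the scheme's default port when none is given.
    port = parts.port or DEFAULT_PORTS.get(parts.scheme)
    return parts.scheme, parts.hostname or "", port


# An explicit default port and an implied one describe the same origin.
print(normalize_origin("https://example.com") == normalize_origin("https://example.com:443"))
# A non-default port makes it a different origin.
print(normalize_origin("http://localhost:8080") == normalize_origin("http://localhost"))
```

The first comparison prints `True` and the second prints `False`, which is why `http://localhost:8080` has to be listed with its port.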
## cors_allowed_origin_regexes
A list of strings representing regexes that match Origins that are authorized to make cross-site HTTP requests. Defaults to `[]`. Useful when `cors_allowed_origins` is impractical, such as when you have a large number of subdomains.
```python
app = SpiderwebRouter(
    cors_allowed_origin_regexes=[
        r"^https://\w+\.example\.com$",
    ]
)
```
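It can help to test a pattern against a few origins before deploying it. The sketch below assumes the patterns are applied with `re.match`; `origin_matches` is a hypothetical helper written for this example, not a Spiderweb function.

```python
import re

patterns = [r"^https://\w+\.example\.com$"]


def origin_matches(origin: str) -> bool:
    """Hypothetical helper: True if any configured regex matches the origin."""
    return any(re.match(pattern, origin) for pattern in patterns)


for origin in (
    "https://api.example.com",     # single subdomain: matches
    "https://example.com",         # no subdomain: does not match
    "https://a.b.example.com",     # nested subdomain: \w+ does not cross dots
    "https://my-app.example.com",  # hyphen is not part of \w
    "http://api.example.com",      # wrong scheme
):
    print(origin, origin_matches(origin))
```

Only the first origin matches. If your subdomains contain hyphens, a character class such as `[\w-]+` would be needed in place of `\w+`.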
## cors_allow_all_origins
If `True`, all origins will be allowed. Other settings restricting allowed origins will be ignored. Defaults to `False`.
Setting this to `True` can be _dangerous_, as it allows any website to make cross-origin requests to yours. Generally you'll want to restrict the list of allowed origins with `cors_allowed_origins` or `cors_allowed_origin_regexes`.
```python
app = SpiderwebRouter(
    cors_allow_all_origins=True
)
```
# Optional settings
All the following settings have sensible defaults, but are available if you want to tweak them for your use case. For most cases, you'll just want to leave these alone.
## cors_urls_regex
A regex which restricts the URLs for which the CORS headers will be sent. Defaults to `r'^.*$'`, i.e. match all URLs. Useful when you only need CORS on a part of your site, e.g. an API at /api/.
```python
app = SpiderwebRouter(
    cors_urls_regex=r"^/api/.*$"
)
```
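The middleware matches this regex against the request path with `re.match`. A small standalone sketch of which paths the example pattern covers (`cors_enabled_for` is a hypothetical name used only here):

```python
import re

cors_urls_regex = r"^/api/.*$"


def cors_enabled_for(path: str) -> bool:
    """Hypothetical helper: would CORS headers be considered for this path?"""
    return bool(re.match(cors_urls_regex, path))


for path in ("/api/users", "/api/", "/api", "/static/app.js"):
    print(path, cors_enabled_for(path))
```

Note that `/api` without the trailing slash does not match this pattern, so adjust the regex if you need to cover both forms.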
## cors_allow_methods
A list of HTTP verbs that are allowed for the actual request. Defaults to:
```python
DEFAULT_CORS_ALLOW_METHODS = (
    "DELETE",
    "GET",
    "OPTIONS",
    "PATCH",
    "POST",
    "PUT",
)
```
The default can be imported from `spiderweb.constants` so you can just extend it with custom methods. This allows you to keep up to date with any future changes. For example:
```python
from spiderweb.constants import DEFAULT_CORS_ALLOW_METHODS as default_methods

app = SpiderwebRouter(
    cors_allow_methods=(
        *default_methods,
        "POKE",
    )
)
```
## cors_allow_headers
The list of non-standard HTTP headers that you permit in requests from the browser. Sets the `Access-Control-Allow-Headers` header in responses to preflight requests. Defaults to:
```python
DEFAULT_CORS_ALLOW_HEADERS = (
    "accept",
    "authorization",
    "content-type",
    "user-agent",
    "x-csrftoken",
    "x-requested-with",
)
```
The default can be imported from `spiderweb.constants` so you can extend it with your custom headers. This allows you to keep up to date with any future changes. For example:
```python
from spiderweb.constants import DEFAULT_CORS_ALLOW_HEADERS as default_headers

app = SpiderwebRouter(
    cors_allow_headers=(
        *default_headers,
        "my-custom-header",
    )
)
```
## cors_expose_headers
The list of extra HTTP headers to expose to the browser, in addition to the default [safelisted headers](https://developer.mozilla.org/en-US/docs/Glossary/CORS-safelisted_response_header). If non-empty, these are declared in the [`access-control-expose-headers` header](https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Access-Control-Expose-Headers). Defaults to `[]`.
## cors_preflight_max_age
The number of seconds (integer) the browser can cache the preflight response. This sets the [`access-control-max-age` header](https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Access-Control-Max-Age) in preflight responses. If this is 0 (or any falsey value), no max age header will be sent. Defaults to `86400` (one day).
Note: Browsers send [preflight requests](https://developer.mozilla.org/en-US/docs/Glossary/Preflight_request) before certain “non-simple” requests, to check they will be allowed. Read more about it in the [CORS MDN article](https://developer.mozilla.org/en-US/docs/Web/HTTP/CORS#preflighted_requests).
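As a rough illustration of how these settings end up in a preflight response, here is a minimal sketch. `preflight_headers` is a hypothetical helper and not Spiderweb's actual code; it only shows the rule that a falsy max age omits the header.

```python
from typing import Dict, Sequence


def preflight_headers(
    allow_methods: Sequence[str],
    allow_headers: Sequence[str],
    max_age: int,
) -> Dict[str, str]:
    """Hypothetical helper: build the CORS headers for a preflight response."""
    headers = {
        "access-control-allow-methods": ", ".join(allow_methods),
        "access-control-allow-headers": ", ".join(allow_headers),
    }
    # A falsy max age (0 or None) means the header is left out entirely.
    if max_age:
        headers["access-control-max-age"] = str(max_age)
    return headers


cached = preflight_headers(("GET", "POST"), ("content-type",), 86400)
uncached = preflight_headers(("GET", "POST"), ("content-type",), 0)
print(cached["access-control-max-age"])
print("access-control-max-age" in uncached)
```

With the default of `86400` the browser may reuse the preflight result for a day; with `0` it has no cache hint from the server.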
## cors_allow_credentials
If `True`, cookies will be allowed to be included in cross-site HTTP requests. This sets the [`Access-Control-Allow-Credentials` header](https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/access-control-allow-credentials) in preflight and normal responses. Defaults to `False`.
> [!NOTE]
> The session cookie, by default, uses `Lax` as the security setting, which will prevent the session cookie from being sent cross-domain. If you want to use `cors_allow_credentials`, you will need to change `session_cookie_same_site` to `none` to bypass the security restriction.
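
Putting the two settings together might look like the following configuration sketch. It assumes that `session_cookie_same_site` is passed to the constructor as a keyword argument like the other settings, and the origin shown is a placeholder.

```python
from spiderweb import SpiderwebRouter

app = SpiderwebRouter(
    middleware=["spiderweb.middleware.cors.CorsMiddleware"],
    # Placeholder origin; list the sites that should be able to send cookies.
    cors_allowed_origins=["https://app.example.com"],
    cors_allow_credentials=True,
    # Assumption: set here as a constructor keyword so the session cookie
    # is no longer restricted to same-site requests.
    session_cookie_same_site="none",
)
```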
## cors_allow_private_network
If `True`, allow requests from sites on a “public” IP to this server on a “private” IP. In such cases, browsers send an extra CORS header `access-control-request-private-network`, for which `OPTIONS` responses must contain `access-control-allow-private-network: true`. Defaults to `False`.
Refer to:
- [Local Network Access](https://wicg.github.io/local-network-access/), the W3C Community Draft specification.
- [Private Network Access: introducing preflights](https://developer.chrome.com/blog/private-network-access-preflight/), a blog post from the Google Chrome team.
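The header exchange described above can be sketched as a small function. `private_network_headers` is a hypothetical helper for illustration, not the middleware's real implementation.

```python
from typing import Dict


def private_network_headers(
    request_headers: Dict[str, str],
    allow_private_network: bool,
) -> Dict[str, str]:
    """Hypothetical helper: answer a private network access preflight."""
    requested = request_headers.get("access-control-request-private-network") == "true"
    # Only opt in when the setting is enabled and the browser asked for it.
    if allow_private_network and requested:
        return {"access-control-allow-private-network": "true"}
    return {}


asking = {"access-control-request-private-network": "true"}
print(private_network_headers(asking, allow_private_network=True))
print(private_network_headers(asking, allow_private_network=False))
print(private_network_headers({}, allow_private_network=True))
```

Only the first call returns the `access-control-allow-private-network` header; the other two return an empty dict.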
# A note about CSRF
Most sites will need to take advantage of the Cross-Site Request Forgery protection built into Spiderweb. CORS and CSRF are separate, and Spiderweb wants you to be explicit about how the domains that you work with fit together. If you need to exempt sites from the [`Referer`](https://en.wikipedia.org/wiki/HTTP_referer#Etymology) checking that Spiderweb performs on secure requests, you can use the `csrf_trusted_origins` setting. For example:
```python
from spiderweb import SpiderwebRouter

app = SpiderwebRouter(
    # Both sites may make cross-origin requests...
    cors_allowed_origins=[
        "https://read-only.example.com",
        "https://read-and-write.example.com",
    ],
    # ...but only this one is exempt from the CSRF origin checks.
    csrf_trusted_origins=[
        "https://read-and-write.example.com",
    ]
)
```


@ -1,5 +1,3 @@
from spiderweb import HttpResponse
# writing your own middleware
Sometimes you want to run the same code on every request or every response (or both!). Lots of processing happens in the middleware layer, and if you want to write your own, all you have to do is write a quick class and put it in a place that Spiderweb can find it. A piece of middleware only needs two things to be successful:
@ -57,6 +55,54 @@ Unlike `process_request`, returning a value here doesn't change anything. We're
This is a helper function that is available for you to override; it's not often used by middleware, but there are some ([like the pydantic middleware](pydantic.md)) that call `on_error` when there is a validation failure.
## checks
If you want to have runtime verifications that ensure that everything is running smoothly, you can take advantage of Spiderweb's `checks` feature.
> [!TIP]
> If you just want to run startup checks, you can also tie this in with the `UnusedMiddleware` exception, as it'll trigger after the checks run.
A startup check looks like this:
```python
from spiderweb.exceptions import ConfigError
from spiderweb.server_checks import ServerCheck


class MyCheck(ServerCheck):
    # You don't have to extract the message out into a top-level
    # variable, but it does make testing your middleware easier.
    MYMESSAGE = "Something has gone wrong!"

    # The function must be called `check` and it takes no args.
    def check(self):
        if self.server.extra_args.get("mykeyword") != "propervalue":
            # Note that we are returning an exception instead of
            # raising it. All config errors are collected and then
            # raised as a single group of all the errors that
            # happened on startup.
            # If everything looks good, don't return anything.
            return ConfigError(self.MYMESSAGE)
```
> [!TIP]
> You should have one check class per actual check that you want to run, as it will make identifying issues much easier.
You can have as many checks as you'd like, and the base Spiderweb instance is available at `self.server`. A failing check must return its exception (**not** raise it!), as all returned errors are raised together as part of an ExceptionGroup called `StartupErrors`.
To enable your checks, link them to your middleware like this:
```python
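from spiderweb.middleware import SpiderwebMiddleware


class MyMiddleware(SpiderwebMiddleware):
    # MyCheck and ADifferentCheck are ServerCheck subclasses defined elsewhere.
    checks = [MyCheck, ADifferentCheck]

    def process_request(self, request):
        ...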
```
List as many checks as you need there, and the server will run all of them during startup.
## UnusedMiddleware
```python


@ -80,7 +80,7 @@ This is an example view. There are a few things to note here:
> See [declaring routes](routes.md) for more information.
> [!TIP]
> [!NOTE]
> Every view must accept a `request` object as its first argument. This object contains all the information about the incoming request, including headers, cookies, and more.
>
> There's more that we can pass in, but for now, we'll keep it simple.


@ -58,6 +58,7 @@ class SpiderwebRouter(LocalServerMixin, MiddlewareMixin, RoutesMixin, FernetMixi
cors_expose_headers: Sequence[str] = None,
cors_preflight_max_age: int = 86400,
cors_allow_credentials: bool = False,
cors_allow_private_network: bool = False,
csrf_trusted_origins: Sequence[str] = None,
db: Optional[Database] = None,
templates_dirs: Sequence[str] = None,
@ -101,6 +102,7 @@ class SpiderwebRouter(LocalServerMixin, MiddlewareMixin, RoutesMixin, FernetMixi
self.cors_expose_headers = cors_expose_headers or []
self.cors_preflight_max_age = cors_preflight_max_age
self.cors_allow_credentials = cors_allow_credentials
self.cors_allow_private_network = cors_allow_private_network
self._csrf_trusted_origins = csrf_trusted_origins or []
self.csrf_trusted_origins = [


@ -1,9 +1,11 @@
import re
from urllib.parse import urlsplit, SplitResult
from spiderweb.exceptions import ConfigError
from spiderweb.request import Request
from spiderweb.response import HttpResponse
from spiderweb.middleware import SpiderwebMiddleware
from spiderweb.server_checks import ServerCheck
ACCESS_CONTROL_ALLOW_ORIGIN = "access-control-allow-origin"
ACCESS_CONTROL_EXPOSE_HEADERS = "access-control-expose-headers"
@ -15,6 +17,24 @@ ACCESS_CONTROL_REQUEST_PRIVATE_NETWORK = "access-control-request-private-network
ACCESS_CONTROL_ALLOW_PRIVATE_NETWORK = "access-control-allow-private-network"
class VerifyValidCorsSetting(ServerCheck):
    INVALID_BASE_CONFIG = (
        "To enable CORS, one of the three primary configurations must be set:"
        " `cors_allowed_origins`, `cors_allowed_origin_regexes`, or"
        " `cors_allow_all_origins`."
    )

    def check(self):
        # - `cors_allowed_origins`
        # - `cors_allowed_origin_regexes`
        # - `cors_allow_all_origins`
        if (
            not self.server.cors_allowed_origins
            and not self.server.cors_allowed_origin_regexes
            and not self.server.cors_allow_all_origins
        ):
            return ConfigError(self.INVALID_BASE_CONFIG)
class CorsMiddleware(SpiderwebMiddleware):
# heavily 'based' on https://github.com/adamchainz/django-cors-headers,
# which is provided under the MIT license. This is essentially a direct
@ -22,6 +42,7 @@ class CorsMiddleware(SpiderwebMiddleware):
# around for a long time and it works well. Shoutouts to Otto, Adam, and
# crew for helping make this a complete non-issue in Django for a very long
# time.
checks = [VerifyValidCorsSetting]
def is_enabled(self, request: Request):
return bool(re.match(self.server.cors_urls_regex, request.path))


@ -70,17 +70,28 @@ class CSRFMiddleware(SpiderwebMiddleware):
CSRF_EXPIRY = 60 * 60 # 1 hour
def is_trusted_origin(self, request) -> bool:
origin = request.headers.get("http_origin")
referrer = request.headers.get("http_referer") or request.headers.get("http_referrer")
host = request.headers.get("http_host")
if not origin and not (host == referrer):
return False
if not origin and (host == referrer):
origin = host
for re_origin in self.server.csrf_trusted_origins:
if re.match(re_origin, origin):
return True
return False
def process_request(self, request: Request) -> HttpResponse | None:
if request.method == "POST":
trusted_origin = False
if hasattr(request.handler, "csrf_exempt"):
if request.handler.csrf_exempt is True:
return
if origin := request.headers.get("http_origin"):
for re_origin in self.server.csrf_trusted_origins:
if re.match(re_origin, origin):
trusted_origin = True
csrf_token = (
request.headers.get("X-CSRF-TOKEN")
@ -88,7 +99,7 @@ class CSRFMiddleware(SpiderwebMiddleware):
or request.POST.get("csrf_token")
)
if not trusted_origin:
if not self.is_trusted_origin(request):
if self.is_csrf_valid(request, csrf_token):
return None
else: