Merge pull request #6 from itsthejoker/gzip

Updates to gzip middleware
Joe Kaufeld 2024-10-30 00:34:08 -04:00 committed by GitHub
commit 24ee9bdda2
No known key found for this signature in database
GPG key ID: B5690EEEBB952194
10 changed files with 210 additions and 40 deletions

View file

@ -57,7 +57,7 @@ Unlike `process_request`, returning a value here doesn't change anything. We're
This is a helper function that is available for you to override; it's not often used by middleware, but there are some ([like the pydantic middleware](middleware/pydantic.md)) that call `on_error` when there is a validation failure.
## post_process(self, request: Request, response: HttpResponse, rendered_response: str) -> str:
## post_process(self, request: Request, response: HttpResponse, rendered_response: str) -> str | bytes:
> New in 1.3.0!
@ -67,7 +67,7 @@ There are three things passed to `post_process`:
- `request`: the request object. It's provided here purely for reference purposes; while you can technically change it here, it won't have any effect on the response.
- `response`: the response object. The full HTML of the response has already been rendered, but the headers can still be modified here. This object can be modified in place, like in `process_response`.
- `rendered_response`: the full HTML of the response as a string. This is the final output that will be sent to the client. Every instance of `post_process` must return the full HTML of the response, so if you want to make changes, you'll need to return the modified string.
- `rendered_response`: the full HTML of the response as a string or bytes. This is the final output that will be sent to the client. Every instance of `post_process` must return the full HTML of the response, so if you want to make changes, you'll need to return the modified string. A string is _strongly_ preferred, but bytes are also acceptable; keep in mind that you'll be making things harder for any `post_process` middleware that comes after you.
Note that this function *must* return the full HTML of the response (provided at the start as `rendered_response`). Each invocation of `post_process` overwrites the entire output of the response, so make sure to return everything that you want to send. For example, here's a middleware that ~~breaks~~ adjusts the capitalization of the response. It also demonstrates passing variables into the middleware and sets a header naming the type of transformation:
@ -75,13 +75,13 @@ Note that this function *must* return the full HTML of the response (provided at
import random
from spiderweb.request import Request
from spiderweb.response import HttpResponse
from spiderweb.middleware import SpiderwebMiddleware
from spiderweb.exceptions import ConfigError
class CaseTransformMiddleware(SpiderwebMiddleware):
# this breaks everything, but it's hilarious so it's worth it.
# Blame Sam.
# this breaks everything, but it's hilarious so it's worth it. Blame Sam.
def post_process(self, request: Request, response: HttpResponse, rendered_response: str) -> str:
valid_options = ["spongebob", "random"]
# grab the value from the extra data passed into the server object
@ -109,6 +109,7 @@ class CaseTransformMiddleware(SpiderwebMiddleware):
)
# usage:
from spiderweb import SpiderwebRouter
app = SpiderwebRouter(
middleware=["CaseTransformMiddleware"],

View file

@ -1,20 +1,39 @@
# Gzip compress middleware
# gzip compression middleware
> New in 1.4.0!
```python
from spiderweb import SpiderwebRouter
app = SpiderwebRouter(
middleware=["spiderweb.middleware.gzip"],
middleware=["spiderweb.middleware.gzip.GzipMiddleware"],
)
```
When your response is big, you maybe want to reduce traffic between
server and client.
Gzip will help you. This middleware do not cover all possibilities of content compress. Brotli, deflate, zsts or other are out of scope.
If your app is serving large responses, you may want to compress them. We don't (currently) have built-in support for Brotli, deflate, zstd, or other compression methods, but we do support gzip. (Want to add support for other methods? We'd love to see a PR!)
This version only check if gzip method is accepted by client, size of content is greater than 500 bytes. Check if response is not already compressed and response status is between 200 and 300.
The implementation in Spiderweb is simple: it compresses the response body if the client indicates that it is supported. If the client doesn't support gzip, the response is sent uncompressed. Compression happens at the end of the response cycle, so it won't interfere with other middleware.
Only responses with a 2xx status code are candidates for compression, so error responses and responses such as 304 pass through untouched; a 204 has an empty body and falls below the minimum length. Responses whose `Content-Encoding` header already lists gzip (e.g. if you're serving pre-compressed files) are also left as-is, as are responses that are already bytes, such as those from `FileResponse`.
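The checks the middleware performs before compressing can be sketched as a standalone function. This is a minimal illustration, not the middleware itself: `should_gzip` and `maybe_gzip` are hypothetical names, and headers are passed as plain dicts rather than Spiderweb's request and response objects.

```python
import gzip


def should_gzip(status_code, body, response_headers, request_headers, minimum_length=500):
    """Return True only when every condition for compression holds."""
    return (
        200 <= status_code < 300
        and isinstance(body, str)
        and len(body) >= minimum_length
        and "gzip" not in response_headers.get("Content-Encoding", "")
        and "gzip" in request_headers.get("Accept-Encoding", "")
    )


def maybe_gzip(status_code, body, response_headers, request_headers,
               minimum_length=500, level=6):
    """Compress the body when allowed; otherwise return it unchanged."""
    if not should_gzip(status_code, body, response_headers, request_headers, minimum_length):
        return body
    zipped = gzip.compress(body.encode("utf-8"), compresslevel=level)
    # The headers must describe the compressed payload, not the original text.
    response_headers["Content-Encoding"] = "gzip"
    response_headers["Content-Length"] = str(len(zipped))
    return zipped
```

Calling `maybe_gzip(200, "x" * 600, {}, {"Accept-Encoding": "gzip"})` returns gzip bytes, while a 401 response or a three-character body comes back unchanged.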
> [!NOTE]
> Minimal required version is 1.3.1
The available configuration options are:
## gzip_minimum_response_length
The minimum size in bytes of a response before it will be compressed. Defaults to `500`. Responses smaller than this will not be compressed.
```python
app = SpiderwebRouter(
gzip_minimum_response_length=1000
)
```
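A small sketch of why a minimum length is useful: the gzip container adds a fixed header and trailer, so very short bodies come out larger than they went in. It uses only the standard library `gzip` module, and the sample strings are made up for illustration.

```python
import gzip

tiny = "Hi!".encode("utf-8")
large = ("<li>spiderweb</li>\n" * 100).encode("utf-8")

tiny_zipped = gzip.compress(tiny)
large_zipped = gzip.compress(large)

# The gzip header and trailer cost more than the 3-byte body saves,
# so the "compressed" version is bigger than the original.
print(len(tiny), len(tiny_zipped))

# Repetitive markup, on the other hand, shrinks considerably.
print(len(large), len(large_zipped))
```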
## gzip_compression_level
The level of compression to use. Defaults to `6`. This is a number between 1 and 9, where 1 is the fastest with the least compression and 9 is maximum compression. (Python's `gzip` module also accepts 0 for no compression, but the startup check rejects it.) Higher levels will result in smaller files, but will take longer to compress. Level 6 is a good balance between file size and speed.
```python
app = SpiderwebRouter(
gzip_compression_level=9
)
```
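To compare levels on your own data, something like the following sketch can help. It uses only the standard library, and the sample body is invented; real responses will compress differently.

```python
import gzip

body = ("<p>The quick brown fox jumps over the lazy dog.</p>\n" * 500).encode("utf-8")

sizes = {}
for level in (1, 6, 9):
    zipped = gzip.compress(body, compresslevel=level)
    # Every level must round-trip back to the original bytes.
    assert gzip.decompress(zipped) == body
    sizes[level] = len(zipped)

for level, size in sizes.items():
    print(f"level {level}: {len(body)} -> {size} bytes")
```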

View file

@ -15,6 +15,7 @@ from spiderweb.response import (
app = SpiderwebRouter(
templates_dirs=["templates"],
middleware=[
"spiderweb.middleware.gzip.GzipMiddleware",
"spiderweb.middleware.cors.CorsMiddleware",
"spiderweb.middleware.sessions.SessionMiddleware",
"spiderweb.middleware.csrf.CSRFMiddleware",

View file

@ -1,6 +1,6 @@
[tool.poetry]
name = "spiderweb-framework"
version = "1.3.1"
version = "1.4.0"
description = "A small web framework, just big enough for a spider."
authors = ["Joe Kaufeld <opensource@joekaufeld.com>"]
readme = "README.md"

View file

@ -69,6 +69,8 @@ class SpiderwebRouter(LocalServerMixin, MiddlewareMixin, RoutesMixin, FernetMixi
csrf_trusted_origins: Sequence[str] = None,
db: Optional[Database] = None,
debug: bool = False,
gzip_compression_level: int = 6,
gzip_minimum_response_length: int = 500,
templates_dirs: Sequence[str] = None,
middleware: Sequence[str] = None,
append_slash: bool = False,
@ -119,6 +121,9 @@ class SpiderwebRouter(LocalServerMixin, MiddlewareMixin, RoutesMixin, FernetMixi
convert_url_to_regex(i) for i in self._csrf_trusted_origins
]
self.gzip_compression_level = gzip_compression_level
self.gzip_minimum_response_length = gzip_minimum_response_length
self.debug = debug
self.extra_data = kwargs

View file

@ -63,7 +63,7 @@ class MiddlewareMixin:
def post_process_middleware(
self, request: Request, response: HttpResponse, rendered_response: str
) -> str:
) -> str | bytes:
# run them in reverse order, same as process_response. The top of the middleware
# stack should be the first and last middleware to run.
for middleware in reversed(self.middleware):
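To illustrate the ordering described in the comment, here is a self-contained sketch. The classes are stand-ins written for this example (they are not Spiderweb's), and the loop body is an assumption about how the chain is applied, since the hunk only shows the loop header.

```python
import gzip


class UppercaseMiddleware:
    def post_process(self, request, response, rendered_response):
        # Only touch str bodies; bytes are passed along unchanged.
        if isinstance(rendered_response, str):
            return rendered_response.upper()
        return rendered_response


class FakeGzipMiddleware:
    def post_process(self, request, response, rendered_response):
        if isinstance(rendered_response, str):
            return gzip.compress(rendered_response.encode("utf-8"))
        return rendered_response


def run_post_process(middleware, request, response, rendered_response):
    # Reverse order: the first entry in the list is the last to run.
    for mw in reversed(middleware):
        rendered_response = mw.post_process(request, response, rendered_response)
    return rendered_response


# Gzip listed first runs last, after the text has been transformed.
gzip_first = run_post_process(
    [FakeGzipMiddleware(), UppercaseMiddleware()], None, None, "hello"
)

# Gzip listed last runs first, so the later middleware only ever sees bytes.
gzip_last = run_post_process(
    [UppercaseMiddleware(), FakeGzipMiddleware()], None, None, "hello"
)
```

This is why the example app lists the gzip middleware at the top of its `middleware` list: everything else gets to work on a string before compression happens.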

View file

@ -1,6 +1,11 @@
from typing import TYPE_CHECKING
from spiderweb.request import Request
from spiderweb.response import HttpResponse
if TYPE_CHECKING:
from spiderweb.server_checks import ServerCheck
class SpiderwebMiddleware:
"""
@ -22,6 +27,10 @@ class SpiderwebMiddleware:
def __init__(self, server):
self.server = server
# If there are any startup checks that need to be run, they should be added
# to this list. These checks should be classes that inherit from
# spiderweb.server_checks.ServerCheck.
self.checks: list[ServerCheck]
def process_request(self, request: Request) -> HttpResponse | None:
# This method is called before the request is passed to the view. You can safely
@ -45,5 +54,6 @@ class SpiderwebMiddleware:
self, request: Request, response: HttpResponse, rendered_response: str
) -> str:
# This method is called after all the middleware has been processed and receives
# the final rendered response in str form. You can modify the response here.
# the final rendered response in str form. You can modify the response here. This
# method *must* return the full rendered response, preferably as a str (bytes are
# also accepted).
return rendered_response
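As a rough sketch of how the startup checks mentioned in `__init__` could be run: the classes below are stand-ins, and `run_checks` is a hypothetical helper, not Spiderweb's actual startup code. The only details taken from this commit are that a check is constructed with `server=...` and exposes a `check()` method that raises `ConfigError` on bad configuration.

```python
class ConfigError(Exception):
    """Stand-in for spiderweb.exceptions.ConfigError."""


class ServerCheck:
    """Stand-in base class: holds the server and exposes check()."""

    def __init__(self, server):
        self.server = server

    def check(self):
        raise NotImplementedError


class CheckPositiveMinimumLength(ServerCheck):
    def check(self):
        value = self.server.gzip_minimum_response_length
        if not isinstance(value, int) or value < 1:
            raise ConfigError(
                "`gzip_minimum_response_length` must be a positive integer."
            )


class ExampleMiddleware:
    # Checks are listed as classes; they are instantiated when they run.
    checks = [CheckPositiveMinimumLength]

    def __init__(self, server):
        self.server = server


def run_checks(server, middleware_instances):
    # Hypothetical helper: run every check that each middleware declares.
    for middleware in middleware_instances:
        for check_class in getattr(middleware, "checks", []):
            check_class(server=server).check()


class BadServer:
    gzip_minimum_response_length = -1


class GoodServer:
    gzip_minimum_response_length = 500
```

With these stand-ins, `run_checks(BadServer, [ExampleMiddleware(BadServer)])` raises `ConfigError`, while the same call with `GoodServer` returns quietly.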

View file

@ -1,36 +1,70 @@
"""
Source code inspiration: https://github.com/colour-science/flask-compress/blob/master/flask_compress/flask_compress.py
"""
from spiderweb.exceptions import ConfigError
from spiderweb.middleware import SpiderwebMiddleware
from spiderweb.server_checks import ServerCheck
from spiderweb.request import Request
from spiderweb.response import HttpResponse
import gzip
class CheckValidGzipCompressionLevel(ServerCheck):
INVALID_GZIP_COMPRESSION_LEVEL = (
"`gzip_compression_level` must be an integer between 1 and 9."
)
def check(self):
if not isinstance(self.server.gzip_compression_level, int):
raise ConfigError(self.INVALID_GZIP_COMPRESSION_LEVEL)
if self.server.gzip_compression_level not in range(1, 10):
raise ConfigError(self.INVALID_GZIP_COMPRESSION_LEVEL)
class CheckValidGzipMinimumLength(ServerCheck):
INVALID_GZIP_MINIMUM_LENGTH = (
"`gzip_minimum_response_length` must be a positive integer."
)
def check(self):
if not isinstance(self.server.gzip_minimum_response_length, int):
raise ConfigError(self.INVALID_GZIP_MINIMUM_LENGTH)
if self.server.gzip_minimum_response_length < 1:
raise ConfigError(self.INVALID_GZIP_MINIMUM_LENGTH)
class GzipMiddleware(SpiderwebMiddleware):
checks = [CheckValidGzipCompressionLevel, CheckValidGzipMinimumLength]
algorithm = "gzip"
minimum_length = 500
def post_process(self, request: Request, response: HttpResponse, rendered_response: str) -> str:
#right status, length > 500, instance string (because FileResponse returns list of bytes ,
# not already compressed, and client accepts gzip
if not (200 <= response.status_code < 300) or \
len(rendered_response) < self.minimum_length or \
not isinstance(rendered_response, str) or \
self.algorithm in response.headers.get("Content-Encoding", "") or \
self.algorithm not in request.headers.get("Accept-Encoding", ""):
def post_process(
self, request: Request, response: HttpResponse, rendered_response: str
) -> str | bytes:
# Only compress the response if all of the following are true:
#
# - The response status code is a 2xx success code
# - The response length is at least `gzip_minimum_response_length`
#   (500 by default)
# - The response is a str (bytes, like the output of FileResponse,
#   are passed through untouched)
# - The response is not already gzip-compressed
# - The request accepts gzip encoding
if (
not (200 <= response.status_code < 300)
or len(rendered_response) < self.server.gzip_minimum_response_length
or not isinstance(rendered_response, str)
or self.algorithm in response.headers.get("Content-Encoding", "")
or self.algorithm not in request.headers.get("Accept-Encoding", "")
):
return rendered_response
zipped = gzip.compress(rendered_response.encode('UTF-8'))
zipped = gzip.compress(
rendered_response.encode("UTF-8"),
compresslevel=self.server.gzip_compression_level,
)
response.headers["Content-Encoding"] = self.algorithm
response.headers["Content-Length"] = str(len(zipped))
return zipped
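One caveat with the substring test on `Accept-Encoding`: a client can send `gzip;q=0` to say it does not accept gzip, and a substring check would still match. Below is a sketch of a stricter check. `accepts_gzip` is a hypothetical helper that is not part of this commit, and it handles only the common `token;q=value` form.

```python
def accepts_gzip(accept_encoding: str) -> bool:
    """Return True if the header lists gzip (or *) with a non-zero q-value."""
    wildcard_ok = False
    for part in accept_encoding.split(","):
        token, _, params = part.strip().partition(";")
        token = token.strip().lower()
        quality = 1.0  # a missing q parameter means full preference
        for param in params.split(";"):
            name, _, value = param.strip().partition("=")
            if name.strip().lower() == "q":
                try:
                    quality = float(value)
                except ValueError:
                    quality = 0.0  # treat a malformed q-value as a refusal
        if token == "gzip":
            # An explicit gzip entry takes precedence over a wildcard.
            return quality > 0
        if token == "*":
            wildcard_ok = quality > 0
    return wildcard_ok
```

For example, `accepts_gzip("gzip, deflate")` is true, while `accepts_gzip("gzip;q=0")` and `accepts_gzip("br")` are false.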

View file

@ -24,6 +24,11 @@ from spiderweb.tests.views_for_tests import (
form_view_without_csrf,
text_view,
unauthorized_view,
file_view,
)
from spiderweb.middleware.gzip import (
CheckValidGzipMinimumLength,
CheckValidGzipCompressionLevel,
)
@ -349,6 +354,98 @@ def test_unused_post_process_middleware():
assert len(app.middleware) == 0
class TestGzipMiddleware:
middleware = {"middleware": ["spiderweb.middleware.gzip.GzipMiddleware"]}
def test_not_enabled_on_small_response(self):
app, environ, start_response = setup(
**self.middleware,
gzip_minimum_response_length=500,
)
app.add_route("/", text_view)
environ["HTTP_ACCEPT_ENCODING"] = "gzip"
environ["HTTP_USER_AGENT"] = "hi"
environ["REMOTE_ADDR"] = "/"
environ["REQUEST_METHOD"] = "GET"
assert app(environ, start_response) == [bytes("Hi!", DEFAULT_ENCODING)]
assert "content-encoding" not in start_response.get_headers()
def test_changing_minimum_response_length(self):
app, environ, start_response = setup(
**self.middleware,
gzip_minimum_response_length=1,
)
app.add_route("/", text_view)
environ["HTTP_ACCEPT_ENCODING"] = "gzip"
environ["HTTP_USER_AGENT"] = "hi"
environ["REMOTE_ADDR"] = "/"
environ["REQUEST_METHOD"] = "GET"
assert app(environ, start_response)[0].startswith(b"\x1f\x8b\x08")
assert "content-encoding" in start_response.get_headers()
def test_not_enabled_on_error_response(self):
app, environ, start_response = setup(
**self.middleware,
gzip_minimum_response_length=1,
)
app.add_route("/", unauthorized_view)
environ["HTTP_ACCEPT_ENCODING"] = "gzip"
environ["HTTP_USER_AGENT"] = "hi"
environ["REMOTE_ADDR"] = "/"
environ["REQUEST_METHOD"] = "GET"
assert app(environ, start_response) == [bytes("Unauthorized", DEFAULT_ENCODING)]
assert "content-encoding" not in start_response.get_headers()
def test_not_enabled_on_bytes_response(self):
app, environ, start_response = setup(
**self.middleware,
gzip_minimum_response_length=1,
)
# send a file that's already in bytes form
app.add_route("/", file_view)
environ["HTTP_ACCEPT_ENCODING"] = "gzip"
environ["HTTP_USER_AGENT"] = "hi"
environ["REMOTE_ADDR"] = "/"
environ["REQUEST_METHOD"] = "GET"
assert app(environ, start_response) == [bytes("hi", DEFAULT_ENCODING)]
assert "content-encoding" not in start_response.get_headers()
def test_invalid_response_length(self):
class FakeServer:
gzip_minimum_response_length = "asdf"
with pytest.raises(ConfigError) as e:
CheckValidGzipMinimumLength(server=FakeServer).check()
assert (
e.value.args[0] == CheckValidGzipMinimumLength.INVALID_GZIP_MINIMUM_LENGTH
)
def test_negative_response_length(self):
class FakeServer:
gzip_minimum_response_length = -1
with pytest.raises(ConfigError) as e:
CheckValidGzipMinimumLength(server=FakeServer).check()
assert (
e.value.args[0] == CheckValidGzipMinimumLength.INVALID_GZIP_MINIMUM_LENGTH
)
def test_bad_compression_level(self):
class FakeServer:
gzip_compression_level = "asdf"
with pytest.raises(ConfigError) as e:
CheckValidGzipCompressionLevel(server=FakeServer).check()
assert (
e.value.args[0]
== CheckValidGzipCompressionLevel.INVALID_GZIP_COMPRESSION_LEVEL
)
class TestCorsMiddleware:
# adapted from:
# https://github.com/adamchainz/django-cors-headers/blob/main/tests/test_middleware.py

View file

@ -1,7 +1,6 @@
from spiderweb import HttpResponse
from spiderweb.decorators import csrf_exempt
from spiderweb.response import JsonResponse, TemplateResponse
from spiderweb.response import JsonResponse, TemplateResponse, FileResponse
EXAMPLE_HTML_FORM = """
<form action="" method="post">
@ -47,3 +46,7 @@ def text_view(request):
def unauthorized_view(request):
return HttpResponse("Unauthorized", status_code=401)
def file_view(request):
return FileResponse("spiderweb/tests/staticfiles/file_for_testing_fileresponse.txt")