Django Basics: Middleware

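Middleware is a framework of hooks into Django's request/response processing. It is a lightweight, low-level "plugin" system for globally altering Django's input or output. It is particularly useful when you need to inject custom behavior or extra data while a request is being handled, for example by modifying the HttpRequest object that is passed to the view or the HttpResponse object that the view returns. If you want to run some logic before the view executes, middleware is the place to do that as well.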

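1. The Concept of Middleware

In a complete Django request/response cycle, the request passes through each middleware in turn, then through URL routing to the view; the response travels back through the middleware in the opposite direction.

Django's default middleware is as follows:

In the settings module of a Django project there is a MIDDLEWARE variable (named MIDDLEWARE_CLASSES before Django 1.10). Each of its elements is one middleware: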

MIDDLEWARE = [
    'django.middleware.security.SecurityMiddleware',
    'django.contrib.sessions.middleware.SessionMiddleware',
    'django.middleware.common.CommonMiddleware',
    'django.middleware.csrf.CsrfViewMiddleware',
    'django.contrib.auth.middleware.AuthenticationMiddleware',
    'django.contrib.messages.middleware.MessageMiddleware',
    'django.middleware.clickjacking.XFrameOptionsMiddleware',
]

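To enable a middleware component, add it to the MIDDLEWARE list in your Django settings file, settings.py.

Django does not actually require any middleware; the MIDDLEWARE setting can be empty if you like. It is strongly recommended, however, that you at least use CommonMiddleware.

2. Django's Built-in Middleware

Django ships with the following middleware, which covers most common needs:

Cache

Cache middleware

When this middleware is enabled, Django caches the entire site, keeping each page for the number of seconds given by the CACHE_MIDDLEWARE_SECONDS setting.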
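As a rough illustration of what enabling site-wide caching involves, a settings fragment might look like the one below. The 600-second timeout, the `default` alias and the empty key prefix are placeholder values chosen for this sketch, not recommendations.

```python
# settings.py (fragment): site-wide caching, with illustrative values
MIDDLEWARE = [
    'django.middleware.cache.UpdateCacheMiddleware',     # must be listed first
    'django.middleware.common.CommonMiddleware',
    # ... the rest of your middleware ...
    'django.middleware.cache.FetchFromCacheMiddleware',  # must be listed last
]

CACHE_MIDDLEWARE_ALIAS = 'default'   # which entry of CACHES to store pages in
CACHE_MIDDLEWARE_SECONDS = 600       # how long each page stays cached
CACHE_MIDDLEWARE_KEY_PREFIX = ''     # set this when several sites share one cache
```

The cache middleware comes as a pair: the update half sits at the top of the list so that it sees the final response, and the fetch half sits at the bottom so that it sees the request last.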

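Common

Common middleware

This middleware provides several conveniences:

Forbids access to user agents listed in DISALLOWED_USER_AGENTS.
Rewrites URLs by appending a trailing slash or prepending www. If APPEND_SLASH is True, the requested URL has no trailing slash, and it matches nothing in the URLconf, Django appends a slash and tries to match again; if that succeeds, it redirects to the new URL. PREPEND_WWW works in a similar way.
Sets the Content-Length header on non-streaming responses.

As an illustration, its source code (from Django 2.0) is reproduced here. It lives in the django.middleware.common module, is fairly simple, and is easy to read and understand: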

class CommonMiddleware(MiddlewareMixin):
    """
    (docstring removed)
    """
    response_redirect_class = HttpResponsePermanentRedirect

    def process_request(self, request):
        # Check for denied User-Agents
        if 'HTTP_USER_AGENT' in request.META:
            for user_agent_regex in settings.DISALLOWED_USER_AGENTS:
                if user_agent_regex.search(request.META['HTTP_USER_AGENT']):
                    raise PermissionDenied('Forbidden user agent')

        # Check for a redirect based on settings.PREPEND_WWW
        host = request.get_host()
        must_prepend = settings.PREPEND_WWW and host and not host.startswith('www.')
        redirect_url = ('%s://www.%s' % (request.scheme, host)) if must_prepend else ''

        # Check if a slash should be appended
        if self.should_redirect_with_slash(request):
            path = self.get_full_path_with_slash(request)
        else:
            path = request.get_full_path()

        # Return a redirect if necessary
        if redirect_url or path != request.get_full_path():
            redirect_url += path
            return self.response_redirect_class(redirect_url)

    def should_redirect_with_slash(self, request):

        if settings.APPEND_SLASH and not request.path_info.endswith('/'):
            urlconf = getattr(request, 'urlconf', None)
            return (
                not is_valid_path(request.path_info, urlconf) and
                is_valid_path('%s/' % request.path_info, urlconf)
            )
        return False

    def get_full_path_with_slash(self, request):

        new_path = request.get_full_path(force_append_slash=True)
        if settings.DEBUG and request.method in ('POST', 'PUT', 'PATCH'):
            raise RuntimeError(
                "You called this URL via %(method)s, but the URL doesn't end "
                "in a slash and you have APPEND_SLASH set. Django can't "
                "redirect to the slash URL while maintaining %(method)s data. "
                "Change your form to point to %(url)s (note the trailing "
                "slash), or set APPEND_SLASH=False in your Django settings." % {
                    'method': request.method,
                    'url': request.get_host() + new_path,
                }
            )
        return new_path

    def process_response(self, request, response):
        # If the given URL is "Not Found", then check if we should redirect to
        # a path with a slash appended.
        if response.status_code == 404:
            if self.should_redirect_with_slash(request):
                return self.response_redirect_class(self.get_full_path_with_slash(request))

        if settings.USE_ETAGS and self.needs_etag(response):
            warnings.warn(
                "The USE_ETAGS setting is deprecated in favor of "
                "ConditionalGetMiddleware which sets the ETag regardless of "
                "the setting. CommonMiddleware won't do ETag processing in "
                "Django 2.1.",
                RemovedInDjango21Warning
            )
            if not response.has_header('ETag'):
                set_response_etag(response)

            if response.has_header('ETag'):
                return get_conditional_response(
                    request,
                    etag=response['ETag'],
                    response=response,
                )
        # Add the Content-Length header to non-streaming responses if not
        # already set.
        if not response.streaming and not response.has_header('Content-Length'):
            response['Content-Length'] = str(len(response.content))

        return response

    def needs_etag(self, response):
        """Return True if an ETag header should be added to response."""
        cache_control_headers = cc_delim_re.split(response.get('Cache-Control', ''))
        return all(header.lower() != 'no-store' for header in cache_control_headers)

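GZip

Content compression middleware

It shrinks response bodies, which reduces bandwidth usage and speeds up transfer.

This middleware must be placed before any other middleware that needs to read or write the response body.

The response is not compressed if any of the following is true:

The body is shorter than 200 bytes.
The response already has a Content-Encoding header.
The request's Accept-Encoding header does not include gzip.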
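The three exclusion rules above can be sketched as a small standalone function. This is a simplified illustration, not Django's actual implementation: `should_compress`, `MIN_LENGTH` and the plain-dict headers are names and simplifications made up for this sketch.

```python
import re

# Matches "gzip" as a whole token inside an Accept-Encoding header value.
ACCEPTS_GZIP = re.compile(r'\bgzip\b')

MIN_LENGTH = 200  # bodies shorter than this are not worth compressing


def should_compress(body, response_headers, accept_encoding):
    """Return True only if none of the three exclusion rules applies."""
    if len(body) < MIN_LENGTH:
        return False                      # rule 1: body too short
    if 'Content-Encoding' in response_headers:
        return False                      # rule 2: already encoded
    if not ACCEPTS_GZIP.search(accept_encoding):
        return False                      # rule 3: client did not ask for gzip
    return True


print(should_compress(b'x' * 500, {}, 'gzip, deflate'))   # True
print(should_compress(b'x' * 100, {}, 'gzip'))            # False
```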

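You can use the gzip_page() decorator to enable GZip compression for an individual view.

Conditional GET

Conditional GET middleware, which handles conditional GET requests based on the ETag and Last-Modified headers; rarely used.

Locale

Locale middleware

Handles internationalization and localization, including translation.

Message

Message middleware

Provides cookie-based or session-based messaging; fairly commonly used.

Security

Security middleware

django.middleware.security.SecurityMiddleware provides a set of security protections for your site. They are controlled by the following settings, each of which can be turned on or off independently: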

SECURE_BROWSER_XSS_FILTER
SECURE_CONTENT_TYPE_NOSNIFF
SECURE_HSTS_INCLUDE_SUBDOMAINS
SECURE_HSTS_PRELOAD
SECURE_HSTS_SECONDS
SECURE_REDIRECT_EXEMPT
SECURE_SSL_HOST
SECURE_SSL_REDIRECT
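As a hedged illustration of how some of these settings are typically written, consider the fragment below. All values are placeholders for this sketch, and the `healthz/` path is a hypothetical endpoint; choose values that suit your own deployment.

```python
# settings.py (fragment): illustrative values only
SECURE_SSL_REDIRECT = True                 # redirect plain-HTTP requests to HTTPS
SECURE_REDIRECT_EXEMPT = [r'^healthz/$']   # hypothetical path that is not redirected
SECURE_HSTS_SECONDS = 3600                 # send Strict-Transport-Security for one hour
SECURE_HSTS_INCLUDE_SUBDOMAINS = True      # add includeSubDomains to the HSTS header
SECURE_HSTS_PRELOAD = False                # add the preload directive only when ready
SECURE_CONTENT_TYPE_NOSNIFF = True         # send X-Content-Type-Options: nosniff
```

Starting with a small SECURE_HSTS_SECONDS value is a cautious choice, because browsers remember the HSTS policy for that long and it cannot be undone from the server side in the meantime.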

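Session

Session middleware; very commonly used.

Site

Sites framework.

This is a useful but often overlooked feature.

It gives a Django project multi-site support.

It adds a site attribute to each request, which identifies the site that the current request is addressed to.

There is no need to run a separate server for each site; the site attribute is enough to serve several sites from one project.

Authentication

Authentication framework

One of Django's most important middleware; it provides user authentication.

CSRF protection

Middleware that provides protection against CSRF attacks

`X-Frame-Options`

Clickjacking protection middleware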

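3. Writing Custom Middleware

Sometimes, to meet a specific requirement, we need to write our own middleware.

The traditional approach

The five hook functions

Writing middleware the traditional way simply means implementing the five hook functions: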

process_request(self,request)
process_response(self, request, response)
process_view(self, request, view_func, view_args, view_kwargs)
process_exception(self, request, exception)
process_template_response(self,request,response)

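You can implement any one of them, or several!

Hook	When it runs	Order	Return value
process_request	As soon as the request arrives, before the view runs	MIDDLEWARE list order	None or an HttpResponse object
process_response	After the view has finished, as the response is returned	Reverse order	An HttpResponse object
process_view	After process_request and URL routing, just before the view runs	MIDDLEWARE list order	None or an HttpResponse object
process_exception	When the view raises an exception	Reverse order	None or an HttpResponse object
process_template_response	Right after the view finishes, before process_response	Reverse order	A response object that implements render()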

`process_request()`

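Signature: process_request(request)

One of the most important hooks!

It takes a single argument, the request, which is the same object the view function receives. Every middleware sees the same request object. The return value is either None or an HttpResponse object. Returning None means all is well and processing continues with the next middleware. Returning an HttpResponse object short-circuits the chain: the remaining middleware and the view function are skipped, and the response is sent back to the browser.

`process_response()`

Signature: process_response(request, response)

One of the most important hooks!

It takes two arguments, request and response. request is the incoming request, and response is the HttpResponse object returned by the view function (or by a middleware closer to the view). The return value must be an HttpResponse object; it cannot be None.

process_response() runs after the view function has finished, in the reverse of the configured order.

`process_view()`

Signature: process_view(request, view_func, view_args, view_kwargs)

request: the HttpRequest object.
view_func: the actual view function that implements the business logic (the function object itself, not its name as a string).
view_args: the list of positional arguments.
view_kwargs: the dictionary of keyword arguments.

Keep in mind that process_view() is called just before Django calls the actual view, and in forward order. Once process_request() has completed normally, Django moves on to URL routing and looks up the matching view; before that view is executed, the process_view() hooks run.

This method must return either None or an HttpResponse object. If it returns None, Django continues processing the request, runs any remaining process_view() hooks, and finally calls the view. If it returns an HttpResponse object, Django does not call the view; it runs the response middleware and returns the result.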
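The process_view() rule described above (run the hooks in order, and let the first non-None result replace the view) can be sketched in plain Python without Django. `dispatch` and the two hook functions are hypothetical stand-ins written for this illustration, and strings stand in for HttpResponse objects.

```python
def dispatch(view, view_hooks, request):
    """Run process_view-style hooks in order; the first non-None result wins."""
    for hook in view_hooks:
        response = hook(request, view, (), {})
        if response is not None:
            return response          # short-circuit: the view is skipped
    return view(request)


log = []


def view(request):
    log.append('view')
    return 'view response'


def md1_process_view(request, view_func, view_args, view_kwargs):
    log.append('Md1 view')
    return None                      # carry on to the next hook


def md2_process_view(request, view_func, view_args, view_kwargs):
    log.append('Md2 view')
    return 'short-circuit response'  # stands in for an HttpResponse


result = dispatch(view, [md1_process_view, md2_process_view], 'fake request')
print(result)  # short-circuit response
print(log)     # ['Md1 view', 'Md2 view']
```

Note that 'view' never appears in the log: once a hook returns a response, the view is not called.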

`process_exception()`

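Signature: process_exception(request, exception)

request: the HttpRequest object.
exception: the exception object raised by the view function.

When a view raises an exception during execution, Django automatically calls the middleware's process_exception() methods. process_exception() returns either None or an HttpResponse object. If it returns an HttpResponse object, the template-response and response middleware are applied; otherwise normal exception handling takes over.

As before, each middleware's process_exception() method is called in reverse order, with short-circuiting: the first one that returns a response stops the rest from being called.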
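The reverse-order, first-response-wins behavior can be sketched as follows. This is a plain-Python illustration rather than Django's handler code; `handle_view_exception` and the three hooks are made-up names, and a dict stands in for the response object.

```python
def handle_view_exception(exception_hooks, request, exc):
    """Call process_exception-style hooks in reverse order."""
    for hook in reversed(exception_hooks):
        response = hook(request, exc)
        if response is not None:
            return response          # handled: remaining hooks are skipped
    raise exc                        # nobody handled it: default error handling


calls = []


def first(request, exc):
    calls.append('first')
    return None


def second(request, exc):
    calls.append('second')
    return {'msg': str(exc), 'code': -1}


def third(request, exc):
    calls.append('third')
    return None


try:
    raise ValueError('boom')
except ValueError as error:
    result = handle_view_exception([first, second, third], 'fake request', error)

print(calls)   # ['third', 'second']
print(result)  # {'msg': 'boom', 'code': -1}
```

The hook listed first is never called here, because the one listed second already produced a response.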

`process_template_response()`

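Signature: process_template_response(request, response)

request: the HttpRequest object.

response: the TemplateResponse object.

process_template_response() is called after the view has finished executing.

Normally a view finishes by rendering a template and returning it to the user as the response. With this hook you can step into that rendering process and add whatever logic you need, for example changing the template or its context data.

process_template_response() hooks are also run in reverse order.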

Hook execution flow


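In the ideal case, the middleware only has process_request() and process_response() methods. The request passes through each middleware's process_request() in order and reaches the view, and the response then passes through each process_response() in reverse order.

As soon as any middleware returns an HttpResponse object, processing switches to the response phase immediately! Note that the response hooks of middleware that never got to run are not executed either. This is short-circuiting, much like peeling an onion.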
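The onion behavior and the short circuit can be reproduced with a few lines of plain Python. The sketch below does not use Django: `MiddlewareSketch` is a stand-in that mimics the dispatch logic of MiddlewareMixin, `make` and `run` are hypothetical helpers, and strings stand in for request and response objects.

```python
class MiddlewareSketch:
    """Stand-in for MiddlewareMixin: wraps the next handler in the chain."""

    def __init__(self, get_response):
        self.get_response = get_response

    def __call__(self, request):
        response = None
        if hasattr(self, 'process_request'):
            response = self.process_request(request)
        if response is None:
            response = self.get_response(request)
        if hasattr(self, 'process_response'):
            response = self.process_response(request, response)
        return response


def make(name, log, block=False):
    """Build a middleware class that records when its hooks run."""
    class Recorder(MiddlewareSketch):
        def process_request(self, request):
            log.append(name + ' request')
            return 'blocked by ' + name if block else None

        def process_response(self, request, response):
            log.append(name + ' response')
            return response
    return Recorder


def run(middleware_classes, log):
    def view(request):
        log.append('view')
        return 'ok'
    handler = view
    # Wrap from the last entry to the first, so the first entry is outermost.
    for cls in reversed(middleware_classes):
        handler = cls(handler)
    return handler('fake request')


full_log = []
run([make('A', full_log), make('B', full_log), make('C', full_log)], full_log)
print(full_log)
# ['A request', 'B request', 'C request', 'view', 'C response', 'B response', 'A response']

short_log = []
run([make('A', short_log), make('B', short_log, block=True), make('C', short_log)], short_log)
print(short_log)
# ['A request', 'B request', 'B response', 'A response']
```

In the second run, B returns a response from process_request, so neither C nor the view runs, and C's process_response is skipped as well.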

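When process_view methods are involved, every process_request() still runs first; Django then resolves the URL, calls each process_view() in forward order, and only then calls the view.

Take the time to study the overall execution flow and mechanism carefully; it will deepen your understanding of middleware.

A worked example

With the theory covered, let's walk through a practical example.

Note that this is called the traditional approach because it imports a parent class that is slated for removal at some point, namely: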

from django.utils.deprecation import MiddlewareMixin

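The word deprecation means that something is discouraged and on its way out; in other words, the MiddlewareMixin class is expected to be removed at some point in the future!

Let's look at the source of MiddlewareMixin: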

class MiddlewareMixin:
    def __init__(self, get_response=None):
        self.get_response = get_response
        super().__init__()

    def __call__(self, request):
        response = None
        if hasattr(self, 'process_request'):
            response = self.process_request(request)
        if not response:
            response = self.get_response(request)
        if hasattr(self, 'process_response'):
            response = self.process_response(request, response)
        return response

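This class does not define any of the five hook methods itself. Instead it defines __call__, which uses hasattr to check whether hooks such as process_request exist and calls them if they do. In essence it works the same way as the new style of writing middleware, shown in the official Django documentation and introduced later.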
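For comparison, a new-style middleware is simply a callable that receives get_response once and is then called for every request. The sketch below is an illustration only: `TimingMiddleware` and the `X-Elapsed-Seconds` header are hypothetical names, and a dict stands in for HttpResponse so that the class can be exercised without Django.

```python
import time


class TimingMiddleware:
    """New-style middleware: no MiddlewareMixin, just __init__ and __call__."""

    def __init__(self, get_response):
        # Called once, when the middleware chain is built.
        self.get_response = get_response

    def __call__(self, request):
        # Code here runs before the view (the process_request position).
        started = time.monotonic()

        response = self.get_response(request)

        # Code here runs after the view (the process_response position).
        response['X-Elapsed-Seconds'] = '%.6f' % (time.monotonic() - started)
        return response


# Exercising it without Django: a dict stands in for HttpResponse, which
# also supports the response['Header'] = value syntax.
def fake_view(request):
    return {'body': 'hello'}


middleware = TimingMiddleware(fake_view)
response = middleware('fake request')
print(sorted(response))  # ['X-Elapsed-Seconds', 'body']
```

The two comments inside __call__ mark the positions that process_request and process_response occupy in the MiddlewareMixin version.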

Now suppose we have an app called polls. Create a middleware.py module in it and add the following code:

from django.http import HttpResponse
from django.utils.deprecation import MiddlewareMixin


class Md1(MiddlewareMixin):

    def process_request(self, request):
        print("Md1 request")

    def process_response(self, request, response):
        print("Md1 response")
        return response

    def process_view(self, request, callback, callback_args, callback_kwargs):
        print("Md1 view")


class Md2(MiddlewareMixin):

    def process_request(self, request):
        print("Md2 request")

    def process_response(self, request, response):
        print("Md2 response")
        return response

    def process_view(self, request, callback, callback_args, callback_kwargs):
        print("Md2 view")
        # Returning a response here short-circuits: the view is never called
        return HttpResponse("123")

Then register both classes in the MIDDLEWARE setting in settings.py.
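Assuming the module is polls/middleware.py as described above, the registration might look like the fragment below. Placing the two classes at the end of the list, with Md1 before Md2, is an assumption of this sketch.

```python
# settings.py (fragment)
MIDDLEWARE = [
    'django.middleware.security.SecurityMiddleware',
    'django.contrib.sessions.middleware.SessionMiddleware',
    'django.middleware.common.CommonMiddleware',
    'django.middleware.csrf.CsrfViewMiddleware',
    'django.contrib.auth.middleware.AuthenticationMiddleware',
    'django.contrib.messages.middleware.MessageMiddleware',
    'django.middleware.clickjacking.XFrameOptionsMiddleware',
    # Custom middleware: dotted path = app.module.ClassName
    'polls.middleware.Md1',
    'polls.middleware.Md2',
]
```

The order matters: request-side hooks run from top to bottom, and response-side hooks run from bottom to top.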

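The result is as follows: with Md1 listed before Md2, the hooks run in the order Md1.process_request, Md2.process_request, Md1.process_view, Md2.process_view, Md2.process_response, Md1.process_response. Because Md2.process_view returns a response, the view itself is skipped.

When nothing goes wrong (no exception), we can see that this is no different from normal execution.

process_exception only runs when a view raises an error. Let's look at the flow after an error occurs.

To do that, we simulate an error and watch what happens.

We add a simple error to views.py.

At this point Md2 still returns "123" from process_view, so the view is never reached; Md2 has to be commented out in MIDDLEWARE first.

Result:

We can also handle the error ourselves. Change Mymiddlewares.py as follows: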

from django.http import JsonResponse
from django.utils.deprecation import MiddlewareMixin


class Md1(MiddlewareMixin):

    def process_request(self, request):
        print("Md1 request")

    def process_response(self, request, response):
        print("Md1 response")
        return response

    def process_view(self, request, callback, callback_args, callback_kwargs):
        print("Md1 view")
        # Log the request before the view runs
        print("path: {}; method: {}; data: {}".format(request.get_full_path(),
                                                      request.method,
                                                      request.body or ''))

    def process_exception(self, request, exception):
        print("Runs when the view raises an exception")
        # str(exception) is safe even when the exception carries no args
        return JsonResponse({"msg": str(exception), "code": -1})

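Send the request again:

The server console shows:

4. Middleware Use Cases

The examples below cover login checks, IP restrictions, rate limiting, modifying the response, and request logging.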

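import json
import time
from json import JSONDecodeError

from django.conf import settings
from django.http import HttpResponse, HttpResponseForbidden, JsonResponse
from django.shortcuts import redirect
from django.utils.deprecation import MiddlewareMixin

# Limit each visitor to no more than 5 requests per 60 seconds.
# Pool of visitor IPs: maps an IP address to the list of timestamps of its
# visits, e.g. {'127.0.0.1': [timestamp, ...]}
visit_ip_pool = {}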


class MyFirstMiddleware(MiddlewareMixin):
    """
    Checks the client IP and whether the user is logged in.
    List it after AuthenticationMiddleware so that request.user is available.
    """

    def process_request(self, request):
        print("Request received; the view is about to run")
        # Keep per-request data on the request object: the middleware instance
        # is shared by all requests, so attributes on self would be overwritten.
        request.request_time = time.time()

        # IP blocking: short-circuit before the view runs
        if request.META.get('REMOTE_ADDR') in getattr(settings, "IP_BLACK_LIST", []):
            print('MyFirstMiddleware short-circuited the request')
            return HttpResponseForbidden('<h1>Access from this IP address is restricted!</h1>')

        # Whitelisted paths, such as the login page, skip the login check
        if request.path in getattr(settings, "URL_WHITE_LIST", []):
            return None

        # Redirect users who are not logged in
        if not request.user.is_authenticated:
            return redirect('/login/')

    def process_response(self, request, response):
        print("MyFirstMiddleware: the view has finished; preparing the response")
        # The view has already run, so get_response() must not be called
        # again here; only post-process the response that was passed in.
        response_time = time.time()
        request_time = getattr(request, 'request_time', response_time)

        # The elapsed time could be written to a log or stored elsewhere
        print(response_time - request_time)

        if response.streaming:
            return response
        try:
            res = json.loads(response.content.decode())
        except (JSONDecodeError, UnicodeDecodeError):
            return response  # not JSON: leave the response untouched
        if not isinstance(res, dict):
            return response

        res['request_time'] = request_time
        res['response_time'] = response_time
        print("Result: {}".format(res))
        return HttpResponse(json.dumps(res), content_type='application/json',
                            status=response.status_code)

    def process_view(self, request, view_func, view_args, view_kwargs):
        print("Runs just before the actual view function")
        # Log the request
        print("path: {}; method: {}; data: {}".format(request.get_full_path(),
                                                      request.method,
                                                      request.body or ''))

    def process_exception(self, request, exception):
        print("Runs when the view raises an exception")
        return JsonResponse({"msg": str(exception), "code": -1})

    def process_template_response(self, request, response):
        # Response timer: time elapsed since the request was received
        if response.context_data is None:
            response.context_data = {}
        response.context_data['response_time'] = time.time() - request.request_time
        return response


class VisitLimit(MiddlewareMixin):
    """
    Limits how often one IP may access the site per unit of time.
    visit_ip_pool lives in process memory, so each worker process keeps
    its own count.
    """

    def process_request(self, request):
        print('VisitLimit: runs before the actual view function')

        # Get the visitor's IP address and the time of this visit
        ip = request.META.get("REMOTE_ADDR")
        visit_time = time.time()
        request.visit_time = visit_time

        # A new IP gets a fresh list; the newest timestamp always goes first
        ip_list = visit_ip_pool.setdefault(ip, [])
        ip_list.insert(0, visit_time)

        # Drop timestamps older than 60 seconds from the end of the list
        while ip_list and visit_time - ip_list[-1] > 60:
            ip_list.pop()

        print('IP:', ip, 'visits in the last 60s:', len(ip_list))

        # More than 5 visits within 60 seconds: reject before the view runs
        if len(ip_list) > 5:
            return HttpResponse(
                json.dumps({"msg": "Sorry, you are sending requests too frequently; "
                                   "this request has been rejected.", "code": -1}),
                content_type="application/json",
                status=429)

    def process_response(self, request, response):
        print("VisitLimit: the view has finished; preparing the response")
        # The view has already run, so get_response() must not be called here.
        # The elapsed time could be written to a log or stored elsewhere.
        now = time.time()
        print(now - getattr(request, 'visit_time', now))
        return response

import sys
from django.views.debug import technical_500_response

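class DebugOnlySuperUser:
    """Debug page tweak: suppose your application is deployed to production with DEBUG = False, as it should be. When a 500 error occurs, you want superusers to still see the debug page while ordinary users do not. That can be done like this:
    """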
    def __init__(self, get_response):
        self.get_response = get_response

    def __call__(self, request):
        response = self.get_response(request)
        return response

    def process_exception(self, request, exception):
        if request.user.is_superuser:
            return technical_500_response(request, *sys.exc_info())
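The sliding-window idea behind VisitLimit can also be tried outside Django. The sketch below is a standalone illustration under its own assumptions: `SlidingWindowLimiter` and its injectable `clock` parameter are names invented here so that the logic can be checked with a fake clock, and unlike the middleware above it does not record rejected hits.

```python
import time
from collections import deque


class SlidingWindowLimiter:
    """Allow at most `limit` hits per `window` seconds for each key."""

    def __init__(self, limit=5, window=60, clock=time.time):
        self.limit = limit
        self.window = window
        self.clock = clock      # injectable so the logic can be tested
        self.hits = {}          # key -> deque of timestamps, oldest first

    def allow(self, key):
        now = self.clock()
        hits = self.hits.setdefault(key, deque())
        # Forget hits that have fallen out of the window.
        while hits and now - hits[0] >= self.window:
            hits.popleft()
        if len(hits) >= self.limit:
            return False
        hits.append(now)
        return True


fake_now = [0.0]
limiter = SlidingWindowLimiter(limit=5, window=60, clock=lambda: fake_now[0])

results = [limiter.allow('127.0.0.1') for _ in range(6)]
print(results)   # [True, True, True, True, True, False]

fake_now[0] = 61.0                      # more than one window later
later = limiter.allow('127.0.0.1')
print(later)     # True
```

An in-memory dictionary like this is only suitable for a single process; with several workers, a shared store such as Django's cache framework would be the more usual choice.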

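5. Decorators vs. Middleware

1. Middleware: it processes the request before the view function runs and does the wrap-up work after the view function has run. It does not distinguish between views; every view is treated the same and goes through the same pre-processing and post-processing.

When a project is created, Django adds seven middleware by default. Things like sessions and CSRF need to be handled before the view function runs; that could be done with decorators, but it can also be done with middleware.

2. Decorators: the main difference is scope. A decorator added to a view function guarantees that its logic runs before or after that view executes, but it only takes effect on the views it has been applied to.

If you need general-purpose processing that does not distinguish between views, use middleware to avoid writing the same code repeatedly. A middleware is essentially a custom class that defines a few methods, which the Django framework calls at specific points in the request cycle.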
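The difference in scope can be seen in a small plain-Python sketch. `require_login` is a hypothetical decorator written for this illustration (it is not Django's login_required), and a dict stands in for the request object.

```python
from functools import wraps


def require_login(view):
    """Hypothetical per-view check: only decorated views are protected."""
    @wraps(view)
    def wrapper(request, *args, **kwargs):
        if not request.get('user'):
            return 'redirect to /login/'
        return view(request, *args, **kwargs)
    return wrapper


@require_login
def profile(request):
    return 'profile of ' + request['user']


def about(request):  # not decorated, so the check never runs for it
    return 'about page'


print(profile({}))               # redirect to /login/
print(profile({'user': 'amy'}))  # profile of amy
print(about({}))                 # about page
```

A middleware performing the same check would apply to profile and about alike, which is why exceptions then have to be listed explicitly, as the URL_WHITE_LIST setting does in the earlier example.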

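6. Notes

process_template_response(request, response): response is a TemplateResponse object. The hook is called after the view has run and must return a response object that implements a render method. It runs in reverse order during the response phase. In other words, it is only called when the view returns a TemplateResponse object (or an equivalent with a render method); the commonly used render shortcut does not trigger it.
Short-circuiting: a short circuit stops the request from propagating to the next middleware (this requires understanding how middleware is executed) and returns directly from the point of the return statement, so the request may never reach the view function. This is middleware short-circuiting, and it is commonly used for features such as access control.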