..
    Copyright 2021 Varnish Software
    SPDX-License-Identifier: BSD-2-Clause
    See LICENSE file for full text of license

.. _whatsnew_upgrading_7.0:

%%%%%%%%%%%%%%%%%%%%%%%%
Upgrading to Varnish 7.0
%%%%%%%%%%%%%%%%%%%%%%%%

PCRE2
=====

The migration from PCRE to PCRE2 led to many changes, starting with a
change of build dependencies. See the current installation notes for
package dependencies on various platforms.

Previously the Just In Time (jit) compilation of regular expressions was
always enabled at run time if it was present during the build. From now
on jit compilation is enabled by default, but can be disabled with the
``--disable-pcre2-jit`` configure option. Once enabled, jit compilation
is merely attempted and failures are ignored since they are not
essential.

The new ``varnishd`` parameter ``pcre2_jit_compilation`` controls whether
jit compilation should be attempted and has no effect if jit support was
disabled at configure time. See :ref:`ref_param_pcre2_jit_compilation`.

The former parameters ``pcre_match_limit`` and
``pcre_match_limit_recursion`` were renamed to ``pcre2_match_limit`` and
``pcre2_depth_limit``. With older PCRE2 libraries, it is possible to see
the depth limit being referred to as recursion limit in error messages.
See :ref:`ref_param_pcre2_match_limit` and
:ref:`ref_param_pcre2_depth_limit` for more information.

The syntax of regular expressions should be the same, but it is possible
to run into subtle differences. We are aware of one such difference:
PCRE2 fails the compilation of unknown escape sequences. For example,
PCRE interprets ``"\T"`` as ``"T"`` and ignores the escape character, but
PCRE2 sees it as a syntax error. The solution is to simply use ``"T"``
and in general remove all spurious escape characters.

PCRE2 can capture named groups and has its own substitution syntax where
captured groups can be referred to by position with ``$`` or even by
name.
The substitution syntax for VCL's ``regsub()`` remains the same and
captured groups still require the ``\`` syntax where ``\1`` refers to
the first group.

For this reason, there shouldn't be changes required to existing VCL,
ban expressions, VSL queries, or anything working with regular
expressions in Varnish, except maybe where PCRE2 seems to be stricter and
refuses invalid escape sequences.

VMOD authors manipulating ``VCL_REGEX`` arguments should not be affected
by this migration if they only use the VRT API. However, the underlying
VRE API was substantially changed, and the new revision of VRE covers all
the Varnish use cases, so ``libvarnish`` is now the only binary linking
*directly* to ``libpcre2-8``.

The migration implies that bans persisted in the deprecated persistent
storage are no longer compatible and a new deprecated persistent storage
should be rebuilt from scratch.

Structured Fields numbers
=========================

VCL types INTEGER and REAL now map respectively to Structured Fields
integer and decimal numbers. Numbers outside of the Structured Fields
bounds are no longer accepted by the VCL compiler, and the various
conversion functions from vmod_std will fail the transaction for numbers
out of bounds.

Scientific notation is no longer supported; for example, ``12.34e+3``
must be spelled out as ``12340`` instead.

Memory footprint
================

In order to lower the likelihood of flushing the logs of a single task
more than once, the default value for ``vsl_buffer`` was increased to
16kB. This should generally result in better performance with tools like
``varnishlog`` or ``varnishncsa`` except for ``raw`` grouping.

To accommodate this extra workspace consumption and add even more
headroom on top of it, ``workspace_client`` and ``workspace_backend``
both increased to 96kB by default.

The PCRE2 jit compiler produces code that consumes more stack, so the
default value of ``thread_pool_stack`` was increased to 80kB, and to
64kB on 32-bit systems.

If you are relying on default values, this will result in an increase of
virtual memory consumption proportional to the number of concurrent
client requests and backend fetches being processed. This memory is not
accounted for in the storage limits that can be applied.

To address a potential head of line blocking scenario with HTTP/2,
request bodies are now buffered between the HTTP/2 session (stream 0)
and the client request. This buffer is allocated on storage, controlled
by the ``h2_rxbuf_storage`` parameter, and comes in addition to the
existing buffering between a client request and a backend fetch, which is
also allocated on storage. The new buffer size depends on
``h2_initial_window_size``, which has a new default value of 65535B to
avoid having streams with negative windows.

Range requests
==============

Varnish only supports the ``bytes`` unit for range requests and
previously always stripped ``Accept-Ranges`` headers coming from the
backend. This is no longer the case for pass transactions.

To find out whether an ``Accept-Ranges`` header came from the backend,
check ``obj.uncacheable`` in ``vcl_deliver``: it indicates whether this
was a pass transaction.

When ``http_range_support`` is on, a consistency check is added to
ensure the backend doesn't act as a bad gateway. If an unexpected
``Content-Range`` header is received, or if it doesn't match the
client's ``Range`` header, it is considered an error and a 503 response
is generated instead.

If your backend adds spurious ``Content-Range`` headers that you can
assess are safe to ignore, you can amend the response in VCL::

    sub vcl_backend_response {
        if (!bereq.http.range) {
            unset beresp.http.content-range;
        }
    }

When a consistency check fails, an error is logged with the specific
range problem encountered.

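As a minimal sketch of the ``obj.uncacheable`` check described above,
the following VCL tags responses whose ``Accept-Ranges`` header was
passed through from the backend. The ``X-Accept-Ranges-Origin`` header is
a hypothetical name used only for illustration and is not part of
Varnish::

    sub vcl_deliver {
        # On a pass transaction the backend's Accept-Ranges header is
        # no longer stripped, so it may be present here.
        if (obj.uncacheable && resp.http.Accept-Ranges) {
            # Hypothetical marker header, for illustration only.
            set resp.http.X-Accept-Ranges-Origin = "backend";
        }
    }

Whether to keep, rewrite or unset such a header depends on what the
backend actually advertises, so treat this as a starting point only.
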
ACL
===

The ``acl`` keyword in VCL now supports bit flags:

- ``log``
- ``pedantic`` (enabled by default)
- ``table``

The global parameter ``vcc_acl_pedantic`` (off by default) was removed,
and as a result ACLs are now pedantic by default. TODO: reference to
manual.

They are also quiet by default; the following ACL declarations are
equivalent::

    acl <name> { ... }
    acl <name> -log +pedantic -table { ... }

This means that the entry an ACL matched is no longer logged as
``VCL_acl`` by default.

To restore the previous default behavior, declare your ACL like this::

    acl <name> +log -pedantic { ... }

ACLs are optimized for runtime performance by default, which can
significantly increase VCL compilation time with very large ACLs. The
``table`` flag improves compilation time at the expense of runtime
performance. See :ref:`vcl-acl`.

Changes for developers
======================

Build
-----

Building from source requires autoconf 2.69 or newer and automake 1.13
or newer. Neither is needed when building from a release archive since
they are already bootstrapped.

There is a new ``--enable-workspace-emulator`` configure flag to replace
the regular "packed allocation" workspace with a "sparse allocation"
alternative. Combined with the Address Sanitizer it can help VMOD authors
find memory handling issues like buffer overflows that could otherwise be
missed on a regular workspace.

``vdef.h``
----------

The ``vdef.h`` header is no longer self-contained; it includes
``stddef.h``.

Since it is the first header that should be included when working with
Varnish bindings, some definitions were promoted to ``vdef.h``:

- a fallback for the ``__has_feature()`` macro in its absence
- VRT macros for Structured Fields number limits
- ``struct txt`` and its companion macros (the macros require ``vas.h``
  too)

This header is implicitly included by ``vrt.h`` and ``cache.h`` and
should not concern VMOD authors.

Workspace API
-------------

The deprecated functions ``WS_Front()`` and ``WS_Inside()`` are gone;
they were replaced by ``WS_Reservation()`` and ``WS_Allocated()``. For
this reason ``WS_Assert_Allocated()`` was removed despite not being
deprecated, since it became redundant with
``assert(WS_Allocated(...))``.

Accessing the workspace front pointer only makes sense during a
reservation, which is why ``WS_Front()`` was deprecated in a previous
release.

It should no longer be necessary to access ``struct ws`` fields
directly, and everything should be possible with the ``WS_*()``
functions. Using them even becomes mandatory when the workspace emulator
is enabled, because the ``struct ws`` fields then have different
semantics.

``STRING_LIST``
---------------

VMOD authors can no longer take ``STRING_LIST`` arguments in functions
or object methods. To work with string fragments, use ``VCL_STRANDS``
instead.

As a result the following symbols are gone:

- ``VRT_String()``
- ``VRT_StringList()``
- ``VRT_CollectString()``
- ``vrt_magic_string_end``

Functions that used to take a ``STRING_LIST`` in the form of a prototype
ending with ``const char *, ...`` now take ``const char *,
VCL_STRANDS``:

- ``VRT_l_client_identity()``
- ``VRT_l_req_method()``
- ``VRT_l_req_url()``
- ``VRT_l_req_proto()``
- ``VRT_l_bereq_method()``
- ``VRT_l_bereq_url()``
- ``VRT_l_bereq_proto()``
- ``VRT_l_beresp_body()``
- ``VRT_l_beresp_proto()``
- ``VRT_l_beresp_reason()``
- ``VRT_l_beresp_storage_hint()``
- ``VRT_l_beresp_filters()``
- ``VRT_l_resp_body()``
- ``VRT_l_resp_proto()``
- ``VRT_l_resp_reason()``
- ``VRT_l_resp_filters()``

The ``VRT_SetHdr()`` function also used to take a ``STRING_LIST`` and
now takes a ``const char *, VCL_STRANDS`` too. In addition to this
change, it no longer accepts the special ``vrt_magic_string_unset``
argument. Instead, a new ``VRT_UnsetHdr()`` function was added.

The ``VRT_CollectStrands()`` function was renamed to
``VRT_STRANDS_string()``, which was its original intended name.

Null sentinels
--------------

Two convenience sentinels ``vrt_null_strands`` and ``vrt_null_blob``
were added to avoid ``NULL`` usage. ``VRT_blob()`` returns
``vrt_null_blob`` when the source is null or the length is zero. The
null blob has the type ``VRT_NULL_BLOB_TYPE``.

libvarnishapi
-------------

Deprecated functions ``VSB_new()`` and ``VSB_delete()`` were removed.
Use ``VSB_init()`` and ``VSB_fini()`` for static buffers and
``VSB_new_auto()`` and ``VSB_destroy()`` for dynamic buffers.

Their removal resulted in bumping the soname to 3.0.0 for
libvarnishapi.

libvarnish
----------

Other changes were made to libvarnish; those are only available to VMOD
authors since they are not exposed by libvarnishapi.

VNUM
''''

The ``VNUMpfx()`` function was replaced by ``SF_Parse_Number()``, which
parses both decimal and integer numbers from RFC 8941. In addition, there
are new, more specialized ``SF_Parse_Decimal()`` and
``SF_Parse_Integer()`` functions.

``VNUM_bytes_unit()`` returns an integer and no longer parses fractional
bytes.

New token parsers ``VNUM_uint()`` and ``VNUM_hex()`` were added.

The other VNUM functions rely on the new SF functions for parsing, with
the same limitations.

The following macros define the Structured Fields number bounds:

- ``VRT_INTEGER_MIN``
- ``VRT_INTEGER_MAX``
- ``VRT_DECIMAL_MIN``
- ``VRT_DECIMAL_MAX``

VRE
'''

The VRE API completely changed in preparation for the PCRE2 migration,
in order to funnel all PCRE usage in the Varnish source code through
VRE.

Similarly to how parameters were renamed, the ``match_recursion`` field
from ``struct vre_limits`` was renamed to ``depth``. It otherwise has
the same meaning and purpose.

Notable breaking changes:

- ``VRE_compile()`` signature changed
- ``VRE_exec()`` was replaced:

  - ``VRE_match()`` does simple matching
  - ``VRE_capture()`` captures matched groups in a ``txt`` array
  - ``VRE_sub()`` substitutes matches with a replacement in a VSB

- ``VRE_error()`` prints an error message for all the functions above in
  a VSB
- ``VRE_export()`` packs a usable ``vre_t`` that can be persisted as a
  byte stream

An exported regular expression takes the form of a byte stream of a
given size that can be used as-is by the various matching functions.
Care should be taken to always maintain pointer alignment of an exported
``vre_t``.

The ``VRE_ERROR_NOMATCH`` symbol is now hard-linked like
``VRE_CASELESS``, and ``VRE_NOTEMPTY`` is no longer supported. There are
no match options left in the VRE facade, but the ``VRE_match()``,
``VRE_capture()`` and ``VRE_sub()`` functions still take an ``options``
argument so that match options can be allowed again in the future.

``VRE_ERROR_LEN`` gives a size that should be safe to avoid truncated
error messages in a static buffer.

To gain full access to PCRE2 features from a regular expression provided
via ``vre_t``, a backend-specific ``vre_pcre2.h`` contains a
``VRE_unpack()`` function. This opens the door, for example, to
``pcre2_substitute()`` with the PCRE2 substitution syntax and named
capture groups as an alternative to VCL's ``regsub()`` syntax backed by
``VRE_sub()``.

Ideally, ``vre_pcre2.h`` will be the only breaking change next time we
move to a different regular expression engine. Hopefully not too soon.

*eof*