## Section 2 Preliminaries

### Subsection 2.1 Stratified groups and homogeneous distances

###### Definition 2.1.

A stratified group is a Lie group $G$ whose Lie algebra has a decomposition $\mathfrak{g}=V_1\oplus V_2\oplus\dots\oplus V_s$ such that $V_s\neq \{0\}$ and $[V_1,V_k] = V_{k+1}$ for all $k=1,\dots,s\text{,}$ with the convention that $V_{s+1}=\{0\}\text{.}$ The rank and step of the stratified group $G$ are the integers $r=\dim V_1$ and $s$ respectively.
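
For illustration, a standard example that the source does not spell out is the first Heisenberg group. Its Lie algebra is spanned by three vectors $X,Y,Z$ (names chosen here for the example) whose only non-trivial bracket is $[X,Y]=Z\text{.}$ A stratification in the sense of Definition 2.1 is then

\begin{equation*} V_1=\operatorname{span}\{X,Y\},\quad V_2=\operatorname{span}\{Z\},\quad [V_1,V_1]=V_2,\quad [V_1,V_2]=\{0\}=V_3\text{,} \end{equation*}

so this group has rank $r=2$ and step $s=2\text{.}$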

###### Definition 2.2.

A dilation by a factor $\dilationfactor\in\RR$ on a stratified group $G$ is the Lie group automorphism $\delta_{\dilationfactor}\colon G\to G$ defined for any $X=X_1+\dots+X_s\in V_1\oplus\dots\oplus V_s$ by

\begin{equation*} \delta_{\dilationfactor}\exp(X_1+X_2+\dots+X_s) = \exp(\dilationfactor X_1 + \dilationfactor^2X_2+\dots+\dilationfactor^sX_s)\text{.} \end{equation*}

###### Definition 2.3.

A homogeneous distance on a stratified group $G$ is a left-invariant distance $d$ that is one-homogeneous with respect to the dilations, i.e., that satisfies

\begin{equation*} d(\delta_{\dilationfactor}(g),\delta_{\dilationfactor}(h)) = \dilationfactor d(g,h)\quad\forall \dilationfactor>0,\, \forall g,h\in G\text{.} \end{equation*}
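
As a sketch of how dilations and homogeneous distances interact, consider the first Heisenberg group with basis $X,Y,Z$ and $[X,Y]=Z$ (an illustrative example with hypothetical coordinates $x,y,z\in\RR\text{,}$ not taken from the source). The dilation scales the second layer quadratically:

\begin{equation*} \delta_{\dilationfactor}\exp(xX+yY+zZ) = \exp(\dilationfactor xX+\dilationfactor yY+\dilationfactor^2 zZ)\text{.} \end{equation*}

In any stratified group, since $\delta_{\dilationfactor}(e)=e\text{,}$ taking $h=e$ in the homogeneity condition gives

\begin{equation*} d(e,\delta_{\dilationfactor}(g)) = \dilationfactor d(e,g)\quad\forall \dilationfactor>0,\, \forall g\in G\text{,} \end{equation*}

so the balls centered at the identity are dilations of one another: $\delta_{\dilationfactor}B(e,r)=B(e,\dilationfactor r)\text{.}$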

### Subsection 2.2 The projection norm

###### Definition 2.4.

Let $G$ be a stratified group and let $d$ be a homogeneous distance on $G\text{.}$ The projection norm associated with the homogeneous distance $d$ is the function

\begin{equation*} \norm{\cdot}_d\colon V_1\to\RR,\quad \norm{X}_d = d(e,\exp(X))\text{,} \end{equation*}

where $e$ is the identity element of the group $G\text{.}$

It is not immediate that $\norm{\cdot}_d$ defines a norm. In the setting of the Heisenberg groups, this is proved in Proposition 2.8 of . Their proof works with minor modifications for any homogeneous distance in an arbitrary stratified group and is captured in the following lemmas. The triangle inequality of $\norm{\cdot}_d$ is the only non-trivial part. In order to make use of the triangle inequality of the distance $d\text{,}$ the following distance estimate is used.

###### Lemma 2.5.

Let $d$ be a homogeneous distance on a stratified group $G\text{.}$ Then $d(e,\exp(X))\leq d(e,\exp(X+Y))$ for any $X\in V_1$ and any $Y\in [\mathfrak{g},\mathfrak{g}]\text{.}$

Observe first that for any $X\in V_1$ and $Y=Y_2+\dots+Y_s\in V_2\oplus\dots\oplus V_s=[\mathfrak{g},\mathfrak{g}]\text{,}$ and any $n\in\NN\text{,}$ homogeneity and the triangle inequality imply that

\begin{align*} nd(e,\exp(X+\frac{1}{n}Y_2 + \dots+\frac{1}{n^{s-1}}Y_s)) &= d(e,\exp(nX+nY))\\ &\leq nd(e,\exp(X+Y))\text{.} \end{align*}

Continuity of the distance then gives the bound

\begin{align*} d(e,\exp(X)) &= \lim\limits_{n\to\infty} d(e,\exp(X+\frac{1}{n}Y_2 + \dots+\frac{1}{n^{s-1}}Y_s))\\ &\leq d(e,\exp(X+Y)) \end{align*}

for any $X\in V_1$ and $Y\in [\mathfrak{g},\mathfrak{g}]$ as claimed.
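
The two steps summarized above as "homogeneity and the triangle inequality" can be unpacked as follows. This is only a sketch, written with the shorthand $g=\exp(X+Y)$ introduced here. By Definition 2.2, the dilation $\delta_n$ multiplies the component in the layer $V_k$ by $n^k\text{,}$ so

\begin{equation*} \delta_n\exp\Big(X+\sum_{k=2}^{s}\frac{1}{n^{k-1}}Y_k\Big) = \exp\Big(nX+\sum_{k=2}^{s}\frac{n^k}{n^{k-1}}Y_k\Big) = \exp(n(X+Y)) = g^n\text{,} \end{equation*}

and homogeneity of $d$ gives the equality in the first display. The inequality follows from the triangle inequality and left-invariance:

\begin{equation*} d(e,g^n)\leq \sum_{j=1}^{n} d(g^{j-1},g^{j}) = \sum_{j=1}^{n} d(e,g) = nd(e,g)\text{.} \end{equation*}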

The previous estimate implies the containment $\pi(B(e,r)) \subset B_{\norm{\cdot}_d}(0,r)$ for the projection of any ball $B(e,r)\subset G\text{.}$ On the other hand, Definition 2.4 of the projection norm directly implies the opposite containment

\begin{equation*} B_{\norm{\cdot}_d}(0,r) = V_1\cap \log B(e,r) \subset \pi(B(e,r))\text{.} \end{equation*}

By left-invariance of the distance $d$ it follows that the map $\pi$ is a submetry.

###### Remark 2.6.

The estimate of Lemma 2.5 is in general false for non-homogeneous left-invariant distances. Examples of the failure may be found by taking any homogeneous metric and tilting the decomposition $V_1\oplus [\mathfrak{g},\mathfrak{g}]\text{.}$

Namely, let $d$ be any homogeneous distance on a stratified group $G$ for which the inequality of Lemma 2.5 is strict when $Y\neq 0\text{,}$ for instance a sub-Riemannian distance. Define a new stratification for the Lie group $G$ by replacing the first layer $V_1$ with $\tilde{V}_1\text{,}$ where some vector $X\in V_1$ is replaced by $X+Y$ for some central vector $Y\in [\mathfrak{g},\mathfrak{g}]\text{.}$ As the Lie brackets and group law are unchanged, $d$ is a left-invariant distance for the resulting stratified group $\tilde{G}\text{,}$ but is no longer homogeneous due to the tilting of the layers. In this way, the notion of projection to the first layer is changed, and the estimate of Lemma 2.5 fails in $(\tilde{G},d)$ for the vectors $X+Y\in\tilde{V}_1$ and $-Y\in[\mathfrak{g},\mathfrak{g}]\text{.}$

Positivity and homogeneity of the projection norm $\norm{\cdot}_d$ follow immediately from positivity and homogeneity of the homogeneous distance $d\text{.}$ For the triangle inequality, let $X,X'\in V_1$ and let $Y\in [\mathfrak{g},\mathfrak{g}]$ be the element given by the Baker-Campbell-Hausdorff formula such that

\begin{equation*} \exp(X)\exp(X') = \exp(X+X'+Y)\text{.} \end{equation*}

Lemma 2.5 gives the bound $\norm{X+X'}_d\leq d(e,\exp(X+X'+Y))\text{.}$ By the choice of $Y\text{,}$ the left-invariance and triangle inequality of $d$ conclude the claim:

\begin{equation*} d(e,\exp(X+X'+Y))= d(e,\exp(X)\exp(X'))\leq \norm{X}_d+\norm{X'}_d\text{.} \end{equation*}
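
In step 2 the element $Y$ is explicit. The following standard consequence of the Baker-Campbell-Hausdorff formula is added here only as an illustration. If $s=2\text{,}$ all brackets of length three vanish, so the series truncates after the first bracket:

\begin{equation*} \exp(X)\exp(X') = \exp\big(X+X'+\tfrac{1}{2}[X,X']\big),\quad\text{i.e.,}\quad Y=\tfrac{1}{2}[X,X']\in V_2\text{.} \end{equation*}

In particular, in step 2 the correction $Y$ vanishes exactly when $X$ and $X'$ commute, and in that case the triangle inequality for $\norm{\cdot}_d$ follows from left-invariance and the triangle inequality of $d$ alone, without Lemma 2.5.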

### Subsection 2.3 Length structures and sub-Finsler Carnot groups

###### Definition 2.8.

Let $(X,d)$ be a metric space. Let $\Omega$ be the space of rectifiable curves of $X$ and let $\ell_d\colon\Omega\to \RR$ be the length functional. For points $x,y\in X\text{,}$ denote by $\Omega(x,y)\subset\Omega$ the space of all rectifiable curves connecting the points $x$ and $y\text{.}$ The length metric associated with the metric $d$ is the map $d_\ell\colon X\times X\to\RR\cup\{\infty\}$ defined by

\begin{equation*} d_\ell(x,y) := \inf \{\ell_d(\gamma): \gamma\in \Omega(x,y) \}\text{.} \end{equation*}

If $d=d_\ell\text{,}$ then the metric $d$ is called a length metric.
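
One inequality between $d$ and $d_\ell$ always holds. The following sketch assumes that $\ell_d$ is the usual length, i.e., the supremum of the sums $\sum_i d(\gamma(t_{i-1}),\gamma(t_i))$ over partitions of the parameter interval. For any curve $\gamma\colon[a,b]\to X$ in $\Omega(x,y)\text{,}$ the trivial partition $\{a,b\}$ bounds the length from below, and taking the infimum over $\Omega(x,y)$ gives

\begin{equation*} d(x,y) = d(\gamma(a),\gamma(b)) \leq \ell_d(\gamma)\quad\Longrightarrow\quad d(x,y)\leq d_\ell(x,y)\text{.} \end{equation*}

Hence $d$ is a length metric precisely when the reverse inequality $d_\ell\leq d$ holds, that is, when any two points $x,y$ can be joined by curves of length arbitrarily close to $d(x,y)\text{.}$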

See Section 2.3 of  for further information about length structures induced by metrics. For the purposes of this paper, only the special case of the length metric associated with a homogeneous distance will be relevant. Such a length metric always determines a sub-Finsler Carnot group, see Definition 2.10 and Lemma 5.1.

###### Definition 2.9.

Let $G$ be a stratified group. Denote by $L_g\colon G\to G$ the left-translation $L_g(h)=gh\text{.}$ An absolutely continuous curve $\gamma\colon [0,T]\to G$ is a horizontal curve if $(L_{\gamma(t)^{-1}})_*\dot{\gamma}(t) \in V_1$ for almost every $t\in [0,T]\text{.}$ The control of a horizontal curve $\gamma$ is its left-trivialized derivative, i.e., the map

\begin{equation*} u\colon [0,T]\to V_1,\quad u(t) = (L_{\gamma(t)^{-1}})_*\dot{\gamma}(t)\text{.} \end{equation*}

###### Definition 2.10.

A sub-Finsler Carnot group is a stratified group $G$ equipped with a norm $\norm{\cdot}\colon V_1\to\RR\text{.}$ The norm induces a homogeneous distance $d_{SF}$ via the length structure induced by $\norm{\cdot}$ over horizontal curves.

More explicitly, for a horizontal curve $\gamma\colon[0,T]\to G$ with control $u\colon [0,T]\to V_1\text{,}$ define the length

\begin{equation*} \ell_{\norm{\cdot}}(\gamma) = \int_{0}^{T}\norm{u(t)}\,dt\text{.} \end{equation*}

For $g,h\in G\text{,}$ let $\Omega(g,h)$ be the family of all horizontal curves connecting $g$ and $h\text{.}$ The sub-Finsler distance $d_{SF}$ is defined as

\begin{equation*} d_{SF}(g,h) := \inf \{ \ell_{\norm{\cdot}}(\gamma): \gamma\in \Omega(g,h) \}\text{.} \end{equation*}
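
As a consistency check, the sub-Finsler distance recovers the norm on the first layer: $d_{SF}(e,\exp(X))=\norm{X}$ for every $X\in V_1\text{.}$ The sketch below assumes that $\pi\colon G\to V_1\text{,}$ $\pi(\exp(X_1+\dots+X_s))=X_1\text{,}$ denotes the horizontal projection, which is a group homomorphism onto $(V_1,+)\text{,}$ so that $\pi\circ\gamma$ has derivative $u$ for a horizontal curve $\gamma$ with control $u\text{.}$ For the upper bound, the curve $\gamma(t)=\exp(tX)\text{,}$ $t\in[0,1]\text{,}$ has constant control $u\equiv X$ and hence length $\norm{X}\text{.}$ For the lower bound, any $\gamma\in\Omega(e,\exp(X))$ with control $u$ satisfies

\begin{equation*} \norm{X} = \norm{\pi(\gamma(T))-\pi(\gamma(0))} = \Big\lVert\int_0^T u(t)\,dt\Big\rVert \leq \int_0^T\norm{u(t)}\,dt = \ell_{\norm{\cdot}}(\gamma)\text{.} \end{equation*}

In the notation of Definition 2.4, this says that the projection norm of $d_{SF}$ is the original norm: $\norm{\cdot}_{d_{SF}}=\norm{\cdot}\text{.}$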

### Subsection 2.4 Geodesics and blowdowns

###### Definition 2.11.

Let $G$ be a stratified group equipped with a homogeneous distance $d\text{.}$ A geodesic is an isometric embedding $\gamma\colon[0,T]\to (G,d)\text{.}$ That is, a geodesic satisfies

\begin{equation*} d(\gamma(t),\gamma(s)) = \abs{t-s}\quad\forall t,s\in[0,T]\text{.} \end{equation*}

In the proof of Theorem 1.3 it will be convenient to consider also curves which preserve distances up to a constant factor. A curve $\gamma\colon[0,T]\to (G,d)$ for which there exists some constant $C>0$ such that

\begin{equation*} d(\gamma(t),\gamma(s)) = C\abs{t-s}\quad\forall t,s\in[0,T] \end{equation*}

will be called a geodesic with speed $C$.
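
The two notions differ only by a reparametrization or a dilation. As a brief sketch (the curve name $\eta$ is introduced here): if $\gamma$ is a geodesic with speed $C\text{,}$ then $\eta\colon[0,CT]\to G\text{,}$ $\eta(t)=\gamma(t/C)\text{,}$ is a geodesic, since

\begin{equation*} d(\eta(t),\eta(s)) = C\abs{\frac{t}{C}-\frac{s}{C}} = \abs{t-s}\quad\forall t,s\in[0,CT]\text{.} \end{equation*}

Alternatively, homogeneity of $d$ shows that the dilated curve $\delta_{1/C}\circ\gamma\colon[0,T]\to G$ is a geodesic:

\begin{equation*} d(\delta_{1/C}\gamma(t),\delta_{1/C}\gamma(s)) = \frac{1}{C}d(\gamma(t),\gamma(s)) = \abs{t-s}\text{.} \end{equation*}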

Since the dilations are group homomorphisms, the claim follows directly by the chain rule and Definition 2.9 of a control:

\begin{equation*} \frac{d}{dt}\gamma_{\dilationfactor}(t) = (\delta_{1/\dilationfactor})_*\frac{d}{dt}\gamma(\dilationfactor t) = (\delta_{1/\dilationfactor})_*(L_{\gamma(\dilationfactor t)})_*u(\dilationfactor t)\dilationfactor = (L_{\gamma_{\dilationfactor}(t)})_*u_{\dilationfactor}(t)\text{.} \end{equation*}

###### Definition 2.13.

Let $\gamma\colon[0,\infty)\to G$ be a horizontal curve. Suppose for some sequence of scales $\dilationfactor_k\to\infty$ the pointwise limit

\begin{equation*} \tilde{\gamma}\colon[0,\infty)\to G,\quad \tilde{\gamma}(t) = \lim\limits_{k\to\infty}\gamma_{\dilationfactor_k}(t) = \lim\limits_{k\to\infty}\delta_{1/\dilationfactor_k}\gamma(\dilationfactor_k t) \end{equation*}

exists. Such a curve $\tilde{\gamma}$ is called a blowdown of the curve $\gamma$ along the sequence of scales $\dilationfactor_k\text{.}$

###### Remark 2.14.

If the curve $\gamma$ is $L$-Lipschitz, then the curves $\gamma_\dilationfactor$ are also all $L$-Lipschitz. Hence by Arzelà-Ascoli, up to taking a subsequence a blowdown along a sequence of scales will always exist.
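
The Lipschitz claim is a one-line consequence of homogeneity. With $\gamma_{\dilationfactor}(t)=\delta_{1/\dilationfactor}\gamma(\dilationfactor t)$ as in Definition 2.13, for all $t,s\geq 0$ one has

\begin{equation*} d(\gamma_{\dilationfactor}(t),\gamma_{\dilationfactor}(s)) = \frac{1}{\dilationfactor}d(\gamma(\dilationfactor t),\gamma(\dilationfactor s)) \leq \frac{1}{\dilationfactor}L\abs{\dilationfactor t-\dilationfactor s} = L\abs{t-s}\text{.} \end{equation*}

The same computation with equality in place of the inequality shows that if $\gamma$ is a geodesic, then so is every $\gamma_{\dilationfactor}\text{.}$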
###### (i).

The curve $\tilde{\gamma}$ is a geodesic as the pointwise limit of geodesics.

###### (ii).

The claim follows from Remark 3.13 of . The point is that by weak compactness of closed balls in $\Lloc([0,\infty);V_1)$ there exists a weakly convergent subsequence $u_\dilationfactor \rightharpoonup v$ to some $v\in\Lloc([0,\infty);V_1)\text{.}$ The definitions of control and weak convergence imply that $v$ is a control for $\tilde{\gamma}\text{,}$ so in particular $\tilde{u}(t)=v(t)$ for almost every $t\text{.}$ Finally, the geodesic assumption implies that $\norm{u(t)}\equiv 1 \equiv \norm{\tilde{u}(t)}\text{,}$ so the weak convergence is upgraded to strong convergence $u_{\dilationfactor}\to \tilde{u}$ in $\Lloc([0,\infty);V_1)\text{.}$

If the geodesic $\gamma$ is itself affine, then the claim is immediate. Suppose then that $\gamma$ is not affine, i.e., not a left translation of a one-parameter subgroup. In particular, the geodesic $\gamma$ has non-constant control. Hence the horizontal projection $\pi\circ\gamma\colon [0,\infty)\to G/[G,G]$ has non-constant derivative, and is also not affine. Since $G/[G,G]$ is a normed space with a strictly convex norm, the projection curve $\pi\circ\gamma$ cannot be a geodesic. Then Theorem 1.4 of  states that there exists a Carnot subgroup $H \lt G$ of lower rank such that all blowdowns of the geodesic $\gamma$ are contained in $H\text{.}$

Let the curve $\beta\colon [0,\infty)\to H \lt G$ be any blowdown. By Lemma 2.15(i), $\beta$ is a geodesic. If $\beta$ is also not affine, then iterating the above argument yields a Carnot subgroup $K \lt H \lt G$ of even lower rank such that all blowdowns of $\beta$ are in $K\text{.}$ Blowdowns of the geodesic $\beta$ are also blowdowns of the geodesic $\gamma$ by a diagonal argument, so the claim follows by induction, since a Carnot subgroup of rank 1 is just a one-parameter subgroup.

### Subsection 2.5 Subdifferentials

In this section, let $V$ be some fixed finite dimensional vector space and let $E\colon V\to\RR$ be a convex continuous function. In the application in Section 5, the space $V$ will be the horizontal layer $V_1\subset\mathfrak{g}\text{,}$ and the convex function of interest will be a squared norm $\frac{1}{2}\norm{\cdot}^2\text{.}$

###### Definition 2.17.

A linear function $a\colon V\to\RR$ is a subdifferential of the function $E$ at a point $Y\in V$ if

\begin{equation*} a(X-Y)\leq E(X)-E(Y)\quad\forall X\in V\text{.} \end{equation*}

The collection of all subdifferentials $a$ at a point $Y\in V$ is denoted $\partial E(Y)\subset V^*\text{.}$
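
As a simple illustration of the definition, suppose (for this example only) that $V$ carries an inner product $\langle\cdot,\cdot\rangle$ with induced norm $\abs{\cdot}$ and that $E=\frac{1}{2}\abs{\cdot}^2\text{.}$ Then the linear function $a=\langle Y,\cdot\rangle$ is a subdifferential of $E$ at $Y\text{,}$ since for every $X\in V$

\begin{equation*} E(X)-E(Y)-a(X-Y) = \frac{1}{2}\abs{X}^2-\langle Y,X\rangle+\frac{1}{2}\abs{Y}^2 = \frac{1}{2}\abs{X-Y}^2\geq 0\text{.} \end{equation*}

In this case $E$ is differentiable, so $\partial E(Y)=\{\langle Y,\cdot\rangle\}$ consists of the gradient alone. For norms that are not differentiable, $\partial E(Y)$ may contain more than one element.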

The following lemmas condense the properties of convex functions and their subdifferentials that will be relevant for the article. They will be utilized in the proof of Theorem 1.1 in Section 5. The first lemma is the continuity and compactness of subdifferentials.

Theorem 24.7 of  shows (among other things) that since the set of points $\mathcal{S}=\{Y_i: i\in\NN\}\cup \{Y\}$ is compact, the family of subdifferentials

\begin{equation*} \partial E(\mathcal{S}):= \bigcup_{X\in \mathcal{S}}\{a\in \partial E(X) \} \end{equation*}

is also compact. Hence there exists a converging subsequence $a_k\to a\in\partial E(\mathcal{S})\text{.}$ The claim is concluded by Theorem 24.4 of , which shows that the convergences $Y_k\to Y$ and $a_k\to a$ with $a_k\in\partial E(Y_k)$ imply that $a\in\partial E(Y)\text{.}$

The second lemma is a simple estimate on a subdifferential of the squared norm.

For any points $X,Y\in V$ and any $\epsilon>0\text{,}$ the subdifferential condition $a\in\partial E(Y)$ implies that

\begin{align*} \epsilon a(X) &= a(Y+\epsilon X - Y) \leq E(Y+\epsilon X) - E(Y)\\ &\leq\frac{1}{2}\Big((\norm{Y}+\epsilon\norm{X})^2 - \norm{Y}^2\Big) = \epsilon \norm{X}\norm{Y} + \frac{1}{2}\epsilon ^2\norm{X}^2\text{.} \end{align*}

Letting $\epsilon\to 0$ proves the bound $a(X)\leq \norm{X}\norm{Y}\text{.}$ Repeating the same consideration for $-X$ gives the opposite bound $-a(X)\leq \norm{X}\norm{Y}\text{.}$

For the equality $a(Y)=\norm{Y}^2\text{,}$ let $\epsilon>0\text{,}$ and observe that a similar computation as before shows that

\begin{align*} -\epsilon a(Y) &= a((1-\epsilon)Y-Y) \leq E((1-\epsilon)Y)-E(Y)\\ &= \frac{1}{2}((1-\epsilon)^2-1)\norm{Y}^2 = (-\epsilon+\frac{1}{2}\epsilon^2)\norm{Y}^2\text{.} \end{align*}

That is, $a(Y)\geq (1 - \frac{1}{2}\epsilon)\norm{Y}^2\text{.}$ The limit as $\epsilon\to 0$ and the previous upper bound prove the claim.
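
Both properties can be checked by hand in a case where the subdifferential is not unique. The following example is an illustration chosen here, not one from the source. Let $V=\RR^2$ with the norm $\norm{(x_1,x_2)}=\max\{\abs{x_1},\abs{x_2}\}\text{,}$ let $E=\frac{1}{2}\norm{\cdot}^2\text{,}$ and let $Y=(1,1)\text{,}$ so that $\norm{Y}=1\text{.}$ For each $\alpha\in[0,1]\text{,}$ consider the linear function $a_\alpha(x_1,x_2)=\alpha x_1+(1-\alpha)x_2\text{.}$ Since $\abs{a_\alpha(X)}\leq\norm{X}$ and $a_\alpha(Y)=1\text{,}$

\begin{equation*} a_\alpha(X-Y) = a_\alpha(X)-1 \leq \norm{X}-1 \leq \frac{1}{2}\norm{X}^2-\frac{1}{2} = E(X)-E(Y)\text{,} \end{equation*}

where the last inequality is $\frac{1}{2}(\norm{X}-1)^2\geq 0\text{.}$ Hence $a_\alpha\in\partial E(Y)$ for every $\alpha\in[0,1]\text{,}$ and each of these subdifferentials satisfies $\abs{a_\alpha(X)}\leq\norm{X}\norm{Y}$ and $a_\alpha(Y)=\norm{Y}^2\text{,}$ in agreement with the bounds proved above.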