Monday, October 20, 2014

Focusing is not Call-by-Push-Value

Ever since I learned about them, I've thought of call-by-push-value and focusing (aka polarization) as essentially two different views of the same problem: they both give a fine-grained decomposition of higher-order effectful programs which permits preserving the full βη-theory of the language.

Until this morning, I had thought that the differences were merely cosmetic, with CBPV arising from Paul Levy's analysis of the relationship between denotational semantics and operational semantics, and focusing arising an analysis of the relationship between operational semantics and proof theory (a lot of people have looked at this, but I learned about it from Noam Zeilberger). Both systems decompose a Moggi-style computational monad into a pair of adjoint operators, which mediate between values and computations (in CBPV) and positive and negative types (in focusing). So I thought this meant that “value type” and “positive type” were synonyms, as were “computation type” and “negative type”.

This morning, I realized I was wrong! Focusing and call-by-push-value make precisely the opposite choices in their treatment of variables! To understand this point, let's first recall the syntax of types for a call-by-push-value (on top) and a polarized (on bottom) calculus.

\begin{mathpar}
\begin{array}{llcl}
\mbox{Value Types} &
X,Y,Z & ::= & \Val{A} \bnfalt 0 \bnfalt X + Y \bnfalt 1 \bnfalt X \times Y \\
\mbox{Computation Types} & 
A,B,C & ::= & \F{X} \bnfalt X \to A \bnfalt \top \bnfalt X \With Y \\[1em]

\mbox{Positive Types} & 
P,Q & ::= & \Down{N} \bnfalt 0 \bnfalt P + Q \bnfalt 1 \bnfalt P \times Q \\
\mbox{Computation Types} & 
M,N & ::= & \Up{P} \bnfalt P \to N \bnfalt \top \bnfalt M \With N \\
\end{array}
\end{mathpar}

At first glance, these two grammars look identical, save only for the renamings $\Val{-} \iff \Down{-}$ and $\F{-} \iff \Up{-}$. But this is misleading! If they are actually the same idea, the reason has to be much more subtle. The reason for this is that the typing judgements for these two systems are actually quite different.

In call-by-push-value, the idea is that $\Val{A}$ is a functor which is left adjoint to $\F{X}$. As a result, values are interpreted in a category of values $\ValueOp$, and computations are interpreted in a category of computations $\CompOp$. The adjunction between values and computations means that the hom-set $\VHom{X}{\Val{A}}$ is ismorphic to the hom-set $\CHom{\F{X}}{A}$. This adjunction gives rise to the two basic judgement forms of call-by-push-value, the value judgement $\judge{\Gamma}{v}{X}$ and the computation judgement $\judgec{\Gamma}{t}{A}$. The idea is that $\interp{\judgev{\Gamma}{v}{X}} \in \VHom{\Gamma}{X}$ and $\interp{\judgec{\Gamma}{t}{A}} \in \CHom{\F{\Gamma}}{A}$.

The key bit is in the interpretation of contexts in computations, so let me highlight that:

$\interp{\judgec{\Gamma}{t}{A}} \in \CHom{\F{\Gamma}}{A}$

Note that we interpret contexts as $\F{\Gamma}$, and so this says that variables refer to values.

However, in a polarized type theory, we observe that positive types are “left-invertible”, and negative types are “right-invertible”. In proof theory, a rule is invertibile when the conclusion implies the premise. For example, the right rule for implication introduction in intuitionistic logic reads

\begin{mathpar}
\inferrule*[]
          {\judgend{\Gamma, S}{T}}
          {\judgend{\Gamma}{S \to T}}
\end{mathpar}

This is invertible because you can prove, as a theorem, that

\begin{mathpar}
\inferrule*[]
          {\judgend{\Gamma}{S \to T}}
          {\judgend{\Gamma, S}{T}}
\end{mathpar}

is an admissible rule of the system. Similarly, sums have a left rule:

\begin{mathpar}
\inferrule*[]
          {\judgend{\Gamma, S}{Q} \\ \judgend{\Gamma, T}{Q}}
          {\judgend{\Gamma, S + T}{Q}}
\end{mathpar}

such that the following two rules are admissible:

\begin{mathpar}
\inferrule*[]
          {\judgend{\Gamma, S + T}{Q}}
          {\judgend{\Gamma, S}{Q}}

\and

\inferrule*[]
          {\judgend{\Gamma, S + T}{Q}}
          {\judgend{\Gamma, T}{Q}}
\end{mathpar}

The key idea behind polarization is that one should specify the calculus modulo the invertible rules. That is, the judgement on the right should fundamentally be a judgement that a term has a positive type, and the hypotheses in the context should be negative. That is, the two primary judgements of a polarized system are the positive introduction judgement

$\judge{\Gamma}{v}{P}$

which explains how introductions for positive types work, and the negative elimination (or spine judgement)

$\spine{\Gamma}{s}{N}{P}$

which explains how eliminations for negative types work. The eliminations for positive types are derived and the introductions for negative types are derived judgements (which end up being rules for pattern matching and lambda-abstractions) which make cut-elimination hold, plus a few book-keeping rules to hook these two judgements together. The critical point is that the grammar for $\Gamma$ consists of negative types:

$\Gamma ::= \cdot \bnfalt \Gamma, x:N$

This is because positive types are (by definition) left-invertible, and so there is no reason to permit them to appear as hypotheses. As a result, the context clearly has a very different character than in call-by-push-value.

I don't have a punchline for this post, in the sense of “and therefore the following weird things happen as a consequence”, but I would be astonished if there weren't some interesting consequences! Both focalization and call-by-push-value teach us that it pays large dividends to pay attention to the fine structure of computation, and it's really surprising that they are apparently not looking at the same fine structure, despite apparently arising from the same dichotomy at the type level.