## Papers updated in last 7 days (68 results)

1/0 Shades of UC: Photonic Side-Channel Analysis of Universal Circuits

A universal circuit (UC) can be thought of as a programmable circuit that
can simulate any circuit up to a certain size by specifying its secret configuration bits. UCs have been incorporated into various applications, such as private function evaluation (PFE). Recently, studies have attempted to formalize the concept of semiconductor intellectual property (IP) protection in the context of UCs. This is despite the observations made in theory and practice that, in reality, the adversary may obtain additional information about the secret when executing cryptographic protocols. This paper aims to answer the question of whether UCs leak information unintentionally, which can be leveraged by the adversary to disclose the configuration bits. In this regard, we propose the first photon emission analysis against UCs relying on computer vision-based approaches. We demonstrate that the adversary can utilize a cost-effective solution to take images to be processed by off-the-shelf algorithms to extract configuration bits. We examine the efficacy of our method in two scenarios: (1) the design is small enough to be captured in a single image during the attack phase, and (2) multiple images should be captured to launch the attack by deploying a divide-and-conquer strategy. To evaluate the effectiveness of our attack, we use metrics commonly applied in side-channel analysis, namely rank and success rate. By doing so, we show that our profiled photon emission analysis achieves a success rate of 1 by employing a few templates (concretely, only 18 images were used as templates).

Small Stretch Problem of the DCT Scheme and How to Fix It

DCT is a beyond-birthday-bound~(BBB) deterministic authenticated encryption~(DAE) mode proposed by Forler et al. in ACISP 2016, ensuring integrity by redundancy. The instantiation of DCT employs the BRW polynomial, which is more efficient than the usual polynomial in GCM by reducing half of the multiplication operations. However, we show that DCT suffers from a small stretch problem similar to GCM. When the stretch length $\tau$ is small, choosing a special $m$-block message, we can reduce the number of queries required by a successful forgery to $\mathcal{O}(2^{\tau}/m)$. We emphasize that this attack efficiently balances space and time complexity but does not contradict the security bounds of DCT. Finally, we propose an improved scheme named Robust DCT~(RDCT) with a minor change to DCT, which improves the security when $\tau$ is small and makes it resist the above attack.

Two-Round Threshold Lattice-Based Signatures from Threshold Homomorphic Encryption

Much recent work has developed efficient protocols for threshold signatures, where $n$ parties share a signing key and some threshold $t$ of those parties must interact to produce a signature. Yet efficient threshold signatures with post-quantum security have been elusive, with the state-of-the-art being a two-round scheme by Damgård et al. (PKC'21) based on lattices that supports only the full threshold case (i.e., $t=n$).
We show here a two-round threshold signature scheme based on standard lattice assumptions that supports arbitrary thresholds $t\leq n$. Estimates of our scheme's performance at the $128$-bit security level show that in the 3-out-of-5 case, we obtain signatures of size $46.6$ KB and public keys of size $13.6$ KB. We achieve $\approx 5\times$ improved parameters if only a small number of signatures are ever issued with the same key.
As an essential building block and independent contribution, we construct an actively secure threshold (linearly) homomorphic encryption scheme that supports arbitrary thresholds $t \leq n$.

Scalable Private Signaling

Private messaging systems that use a bulletin board, like privacy-preserving blockchains, have been a popular topic during the last couple of years. In these systems, typically a private message is posted on the board for a recipient and the privacy requirement is that no one can determine the sender and the recipient of the message. Until recently, the efficiency of these recipients was not considered, and the party had to perform a naive scan of the board to retrieve their messages.
More recently, works like Fuzzy Message Detection (FMD), Private Signaling (PS), and Oblivious Message Retrieval (OMR) have studied the problem of protecting recipient privacy by outsourcing the message retrieval process to an untrusted server. However, FMD only provides limited privacy guarantees, and PS and OMR greatly lack scalability.
In this work, we present a new construction for private signaling which is both asymptotically superior and concretely orders of magnitude faster than all prior works while providing full privacy. Our constructions make use of a trusted execution environment (TEE) and an Oblivious RAM to improve the computation complexity of the server. We also improve the privacy guarantees by keeping the recipient hidden even during the retrieval of signals from the server. Our proof-of-concept open-source implementation shows that for a server serving a million recipients and ten million messages, it only takes $< 60$ milliseconds to process a sent message, and $< 6$ seconds to process a retrieval (of 100 signals) request from a recipient.

Multiple Group Action Dlogs with(out) Precomputation

Let $\star: G \times X \rightarrow X$ be the action of a group $G$ of size $N=|G|$ on a set $X$. Let $y = g \star x \in X$ be a group action dlog instance, where our goal is to compute the unknown group element $g \in G$ from the known set elements $x,y \in X$.
The Galbraith-Hess-Smart (GHS) collision finding algorithm solves the group action dlog in $N^{\frac 1 2}$ steps with polynomial memory.
We show that group action dlogs are suitable for precomputation attacks. More precisely, for any $s,t$ our precomputation algorithm computes within $st$ steps a hint of space complexity $s$, which allows to solve any group action dlog in an online phase within $t$ steps. A typical instantiation is $s=t=N^{\frac 1 3}$, which gives precomputation time $N^{\frac 2 3}$ and space $N^{\frac 1 3}$, and online time only $N^{\frac 1 3}$.
Moreover, we show that solving multiple group action dlog instances $y_1, \ldots , y_m$ allows for speedups. Namely, our collision finding algorithm solves $m$ group action dlogs in $\sqrt{m}N^{\frac 1 2}$ steps, instead of the straight-forward $mN^{\frac 1 2}$ steps required for running $m$ times GHS.
Interestingly, our multi instance algorithm (with precomputation) can be seen as a special case of our precomputation algorithm.
Our multiple instance approach can be freely combined with our precomputations, allowing for a variety of tradeoffs.
Technically, our precomputation and multiple instance group action dlog attacks are adaptations of the techniques from the standard dlog setting in abelian groups. While such an adaptation seems natural, it is per se unclear which techniques transfer from the dlog to the more restricted group dlog setting, for which $X$ does not offer a group structure.
Our algorithms have direct implications for all group action based cryptosystems, such as CSIDH and its variants. We provide experimental evidence that our techniques work well in the CSIDH setting.

Leakage-Abuse Attacks Against Structured Encryption for SQL

Structured Encryption (StE) enables a client to securely store and query data stored on an untrusted server. Recent constructions of StE have moved beyond basic queries, and now support large subsets of SQL. However, the security of these constructions is poorly understood, and no systematic analysis has been performed.
We address this by providing the first leakage-abuse attacks against StE for SQL schemes. Our attacks can be run by a passive adversary on a server with access to some information about the distribution of underlying data, a common model in prior work. They achieve partial query recovery against select operations and partial plaintext recovery against join operations. We prove the optimality and near-optimality of two new attacks, in a Bayesian inference framework. We complement our theoretical results with an empirical investigation testing the performance of our attacks against real-world data and show they can successfully recover a substantial proportion of queries and plaintexts.
In addition to our new attacks, we provide proofs showing that the conditional optimality of a previously proposed leakage-abuse attack and that inference against join operations is NP-hard in general.

An overview of symmetric fuzzy PAKE protocols

Fuzzy password authenticated key exchange (fuzzy PAKE) protocols enable two parties to securely exchange a session-key for further communication. The parties only need to share a low entropy password. The passwords do not even need to be identical, but can contain some errors. This may be due to typos, or because the passwords were created from noisy biometric readings. In this paper we provide an overview and comparison of existing fuzzy PAKE protocols. Furthermore, we analyze certain security properties of these protocols and argue that the protocols can be expected to be slightly more secure in practice than can be inferred from their theoretical guarantees.

Understanding binary-Goppa decoding

This paper reviews, from bottom to top, a polynomial-time algorithm to correct t errors in classical binary Goppa codes defined by squarefree degree-t polynomials. The proof is factored through a proof of a simple Reed--Solomon decoder, and the algorithm is simpler than Patterson's algorithm. All algorithm layers are expressed as Sage scripts backed by test scripts. All theorems are formally verified. The paper also covers the use of decoding inside the Classic McEliece cryptosystem, including reliable recognition of valid inputs.

Elastic MSM: A Fast, Elastic and Modular Preprocessing Technique for Multi-Scalar Multiplication Algorithm on GPUs

Zero-knowledge proof (ZKP) is a cryptographic primitive that enables a prover to convince a verifier that a statement is true, without revealing any other information beyond the correctness of the statement itself. Due to its powerful capabilities, its most practical type, called zero-knowledge Succinct Non-interactive ARgument of Knowledge (zkSNARK), has been widely deployed in various privacy preserving applications such as cryptocurrencies and verifiable computation. Although state-of-the-art zkSNARKs are highly efficient for the verifier, the computational overhead for the prover is still orders of magnitude too high to warrant use in many applications. This overhead arises from several time-consuming operations, including large-scale matrix-vector multiplication (MUL), number-theoretic transform (NTT), and especially the multi-scalar multiplication (MSM) which constitutes the largest proportion. Therefore, further efficiency improvements are needed.
In this paper, we focus on comprehensive optimization of running time and storage space required by the MSM algorithm on GPUs. Specifically, we propose a novel, modular and adaptive parameter configuration technique—elastic MSM to enable us to adjust the scale of MSM according to our own wishes by performing a corresponding amount of preprocessing. This technique enables us to fully unleash the potential of various efficient parallel MSM algorithms. We have implemented and tested elastic MSM over three prevailing parallel Pippenger algorithms on GPUs. Given a range of practical parameters, across various preprocessing space limitations (across various MSM scales), our constructions achieve up to about 90%, 8% and 36% (2.58×, 39% and 91%) speedup versus three state-of-the-art parallel Pippenger algorithms on GPUs, respectively.
From another perspective, elastic MSM could also be regarded as a preprocessing technique over the well-known Pippenger algorithm, which is modular and could be used to accelerate almost all the most advanced parallel Pippenger algorithms on GPUs. Meanwhile, elastic MSM provides an adaptive trade-off between the running time and the extra storage space needed by parallel Pippenger algorithms on GPUs. This is the first preprocessing technique to retain the improved MSM computation brought by preprocessing under varying storage space limitations. Specifically, given a range of practical parameters, across various preprocessing space limitations (across various MSM scales), our constructions achieve up to about 192× and 223× (159× and 174×) speedup versus two state-of-the-art preprocessing parallel Pippenger algorithms on GPUs, respectively.

Weakly Profiling Side-channel Analysis

Profiling side-channel analysis, recognized for its robust attack performance in worst-case scenarios, necessitates adversaries to have a cloned device for profiling measurements and secret information for data labeling. On the other hand, non-profiling attacks eschew these requirements by trying all key guesses. Although more suitable for real-world attack scenarios, they may suffer from mediocre attack performance due to the lack of leakage insight.
This paper introduces a novel weakly profiling side-channel analysis method that bridges classical profiling and non-profiling analyses. Our method operates within a profiling framework yet discards the necessity for a cloned device, which relies on the fact that there is (commonly) a bijective relationship between known information, such as plaintext and ciphertext, and secret information. This relationship allows an adversary to label leakage measurements using known information and then profile leakages directly on the attacked device. The empirical results show that the proposed approach reaches at least $3\times$ better attack performance with negligible computational effort than existing non-profiling methods. Moreover, it can rival the performance of state-of-the-art profiling attacks.

Communication-Efficient Multi-Party Computation for RMS Programs

Despite much progress, general-purpose secure multi-party computation (MPC) with active security may still be prohibitively expensive in settings with large input datasets. This particularly applies to the secure evaluation of graph algorithms, where each party holds a subset of a large graph.
Recently, Araki et al. (ACM CCS '21) showed that dedicated solutions may provide significantly better efficiency if the input graph is sparse. In particular, they provide an efficient protocol for the secure evaluation of "message passing" algorithms, such as the PageRank algorithm. Their protocol's computation and communication complexity are both $\tilde{O}(M\cdot B)$ instead of the $O(M^2)$ complexity achieved by general-purpose MPC protocols, where $M$ denotes the number of nodes and $B$ the (average) number of incoming edges per node. On the downside, their approach achieves only a relatively weak security notion; $1$-out-of-$3$ malicious security with selective abort.
In this work, we show that PageRank can instead be captured efficiently as a restricted multiplication straight-line (RMS) program, and present a new actively secure MPC protocol tailored to handle RMS programs. In particular, we show that the local knowledge of the participants can be leveraged towards the first maliciously-secure protocol with communication complexity linear in $M$, independently of the sparsity of the graph. We present two variants of our protocol. In our communication-optimized protocol, going from semi-honest to malicious security only introduces a small communication overhead, but results in quadratic computation complexity $O(M^2)$. In our balanced protocol, we still achieve a linear communication complexity $O(M)$, although with worse constants, but a significantly better computational complexity scaling with $O(M\cdot B)$. Additionally, our protocols achieve security with identifiable abort and can tolerate up to $n-1$ corruptions.

Amortizing Circuit-PSI in the Multiple Sender/Receiver Setting

Private set intersection (PSI) is a cryptographic functionality for two parties to learn the intersection of their input sets, without leaking any other information. Circuit-PSI is a stronger PSI functionality where the parties learn only a secret-shared form of the desired intersection, thus without revealing the intersection directly. These secret shares can subsequently serve as input to a secure multiparty computation of any function on this intersection.
In this paper we consider several settings in which parties take part in multiple Circuit-PSI executions with the same input set, and aim to amortize communications and computations. To that end, we build up a new framework for Circuit-PSI around generalizations of oblivious (programmable) PRFs that are extended with offline setup phases. We present several efficient instantiations of this framework with new security proofs for this setting. As a side result, we obtain a slight improvement in communication and computation complexity over the state-of-the art Circuit-PSI protocol by Bienstock et al. (USENIX '23). Additionally, we present a novel Circuit-PSI protocol from a PRF with secret-shared outputs, which has linear communication and computation complexity in the parties' input set sizes, and incidentally, it realizes ``almost malicious'' security, making it the first major step in this direction since the protocol by Huang et al. (NDSS '12). Lastly, we derive the potential amortizations over multiple protocol executions, and observe that each of the presented instantiations is favorable in at least one of the multiple-execution settings.

Quantum Implementation and Analysis of SHA-2 and SHA-3

Quantum computers have the potential to solve hard problems that are nearly impossible to solve by classical computers, this has sparked a surge of research to apply quantum technology and algorithm against the cryptographic systems to evaluate for its quantum resistance. In the process of selecting post-quantum standards, NIST categorizes security levels based on the complexity that quantum computers would require to crack AES encryption (levels 1, 3, and 5) and SHA-2 or SHA-3 (levels 2 and 4).
In assessing the security strength of cryptographic algorithms against quantum threats, accurate predictions of quantum resources are crucial. Following the work of Jaques et al. in Eurocrypt 2020, NIST estimated security levels 1, 3, and 5, corresponding to quantum circuit size for finding the key for AES-128, AES-192, and AES-256, respectively. This work has been recently followed-up by Huang et al. (Asiacrypt'22) and Liu et al. (Asiacrypt'23). However, for levels 2 and 4, which relate to the collision finding for the SHA-2 and SHA-3 hash functions, quantum attack complexities are probably not well-studied.
In this paper, we present novel techniques for optimizing the quantum circuit implementations for SHA-2 and SHA-3 algorithms in all the categories specified by NIST. After that, we evaluate the quantum circuits of target cryptographic hash functions for quantum collision search. Finally, we define the quantum attack complexity for levels 2 and 4, and comment on the security strength of the extended level. We present new concepts to optimize the quantum circuits at the component level and the architecture level.

A Near-Linear Quantum-Safe Third-Party Private Set Intersection Protocol

Third-party private set intersection (PSI) enables two parties, each holding a private set to compute their intersection and reveal the result only to an inputless third party. In this paper, we present efficient third-party PSI protocols, which significantly lower the computational workload compared to prior work. Our work is motivated by real-world applications such as contact tracing whereby expedition is essential while concurrently preserving privacy. Our construction attains a near-linear computational complexity of $O(n^{1+\varepsilon})$ for large dataset size $n$, where $\varepsilon>0$ is any fixed constant, and achieves post-quantum security. For a quantum-safe third-party PSI protocol, this significantly improves upon the current known best of $O(n^{2.5+o(1)})$. Our improvements stem from algorithmic changes and the incorporation of new techniques along with precise parameter selections to achieve a tight asymptotic bound.

On the construction of quantum circuits for S-boxes with different criteria based on the SAT solver

The substitution box (S-box) is often used as the only nonlinear component in symmetric-key ciphers, leading to a significant impact on the implementation performance of ciphers in both classical and quantum application scenarios by S-box circuits. Taking the Pauli-X gate, the CNOT gate, and the Toffoli gate (i.e., the NCT gate set) as the underlying logic gates, this work investigates the quantum circuit implementation of S-boxes based on the SAT solver. Firstly, we propose encoding methods of the logic gates and the NCT-based circuit, based on which we construct STP models for implementing S-boxes. By applying the proposed models to the S-boxes of several well-known cryptographic algorithms, we construct optimal implementations with different criteria for the 4-bit S-boxes and provide the implementation bounds of different criteria for the 5-bit S-boxes. Since S-boxes in the same affine equivalence class share most of the important properties, we then build STP models to further investigate optimizing S-box circuits based on affine equivalence. According to the applications, for almost all the tested 4-bit S-boxes, there always exists an equivalent S-box that can be implemented with half the number of logic gates. Besides, we encode some important cryptographic properties and construct STP models to design S-boxes with given criteria configurations on implementation and properties. As an application, we find an S-box with the same cryptographic properties as the S-box of KECCAK that can be implemented with only 5 NCT gates, even though the application of our models indicates that implementing the KECCAK S-box requires more than 9 NCT gates. Notably, the inputs of the proposed models are tweakable, which makes the models possess some functions not currently available in the public tools for constructing optimized NCT-based circuits for S-boxes.

Atlas-X Equity Financing: Unlocking New Methods to Securely Obfuscate Axe Inventory Data Based on Differential Privacy

Banks publish daily a list of available securities/assets (axe list) to selected clients to help them effectively locate Long (buy) or Short (sell) trades at reduced financing rates. This reduces costs for the bank, as the list aggregates the bank's internal firm inventory per asset for all clients of long as well as short trades. However, this is somewhat problematic: (1) the bank's inventory is revealed; (2) trades of clients who contribute to the aggregated list, particularly those deemed large, are revealed to other clients. Clients conducting sizable trades with the bank and possessing a portion of the aggregated asset exceeding $50\%$ are considered to be concentrated clients. This could potentially reveal a trading concentrated client's activity to their competitors, thus providing an unfair advantage over the market.
Atlas-X Axe Obfuscation, powered by new differential private methods, enables a bank to obfuscate its published axe list on a daily basis while under continual observation, thus maintaining an acceptable inventory Profit and Loss (P\&L) cost pertaining to the noisy obfuscated axe list while reducing the clients' trading activity leakage. Our main differential private innovation is a differential private aggregator for streams (time series data) of both positive and negative integers under continual observation.
For the last two years, Atlas-X system has been live in production across three major regions—USA, Europe, and Asia—at J.P. Morgan, a major financial institution, facilitating significant profitability. To our knowledge, it is the first differential privacy solution to be deployed in the financial sector. We also report benchmarks of our algorithm based on (anonymous) real and synthetic data to showcase the quality of our obfuscation and its success in production.

A Note on Related-Tweakey Impossible Differential Attacks

In this short note we review the technique proposed at ToSC 2018 by Sadeghi et al. for attacks built upon several related-tweakey impossible differential trails. We show that the initial encryption queries are improper and lead the authors to misevaluating a filtering value in the key recovery phase. We identified 4 papers (from Eurocrypt, DCC, ToSC and ePrint) that follow on the results of Sadeghi et al., and in three of them the issue was propagated.
We thus present a careful analysis of these types of attacks and give generic complexity formulas similar to the ones proposed by Boura et al. at Asiacrypt 2014. We apply these to the aforementioned papers and provide patched versions of their attacks. The main consequence is an increase in the memory complexity. We show that in many cases (a notable exception being quantum impossible differentials) it is possible to recover the numeric estimates of the flawed analysis, and in all cases we were able to build a correct attack reaching the same number of rounds.

Logstar: Efficient Linear* Time Secure Merge

Secure merge considers the problem of combining two sorted lists into a single sorted secret-shared list. Merge is a fundamental building block for many real-world applications. For example, secure merge can implement a large number of SQL-like database joins, which are essential for almost any data processing task such as privacy-preserving fraud detection, ad conversion rates, data deduplication, and many more.
We present two constructions with communication bandwidth and rounds tradeoff. Logstar, our bandwidth-optimized construction, takes inspiration from Falk and Ostrovsky (ITC, 2021) and runs in $O(n\log^*n)$ time and communication with $O(\log n)$ rounds. In particular, for all conceivable $n$, the $\log^*n$ factor will be equal to the constant $2$, and therefore we achieve a near-linear running time. Median, our rounds-optimized construction, builds on the classic parallel medians-based insecure merge approach of Valiant (SIAM J. Comput., 1975), later explored in the secure setting by Blunk et al. (2022), and requires $O(n \log^c n)$, $1<c<2$, communication with $O(\log \log n)$ rounds.
We introduce two additional constructions that merge input lists of different sizes. SquareRootMerge, merges lists of sizes $n^{\frac{1}{2}}$ and $n$, and runs in $O(n)$ time and communication with $O(\log n)$ rounds. CubeRootMerge is closely inspired by Blunk et al.'s (2022) construction and merges lists of sizes $n^{\frac{1}{3}}$ and $n$. It runs in $O(n)$ time and communication with $O(1)$ rounds.
We optimize our constructions for concrete efficiency. Today, concretely efficient secure merge protocols rely on standard techniques such as GMW or generic sorting. These approaches require an $O(n \log n)$ size circuit of $O(\log n)$ depth. In contrast, our constructions are more efficient and also achieve superior asymptotics. We benchmark our constructions and obtain significant improvements. For example, Logstar reduces bandwidth costs $\approx 3.3\times$ and Median reduces rounds $\approx2.22\times$.

Practical Proofs of Parsing for Context-free Grammars

In this work-in-progress, we present a series of protocols to efficiently prove statements about strings in context-free grammars (CFGs). Our main protocol for proving proofs of correct parsing for strings in a CFG flexibly accommodates different instantiations of zero-knowledge proof systems as well as accumulation schemes. While improvements in the modular cryptographic primitives can be carried over for improvements in our protocols, even simpler proof systems, which do not support state-of-the-art techniques such as permutation checks can generate proofs of correct parsing of a string of size $n$ by proving the correctness of a circuit of size $\mathcal{O}(cn)$, where $c$ is the cost of verifying a set membership proof in the chosen accumulation scheme.

Tight Indistinguishability Bounds for the XOR of Independent Random Permutations by Fourier Analysis

The XOR of two independent permutations (XoP) is a well-known construction for achieving security beyond the birthday bound when implementing a pseudorandom function using a block cipher (i.e., a pseudorandom permutation). The idealized construction (where the permutations are uniformly chosen and independent)
and its variants have been extensively analyzed over nearly 25 years.
The best-known asymptotic information-theoretic indistinguishability bound for the XoP construction is $O(q/2^{1.5n})$, derived by Eberhard in 2017, where $q$ is the number of queries and $n$ is the block length.
A generalization of the XoP construction outputs the XOR of $r \geq 2$ independent permutations, and has also received significant attention in both the single-user and multi-user settings. In particular, for $r = 3$, the best-known bound (obtained by Choi et al. [ASIACRYPT'22]) is about $q^2/2^{2.5 n}$ in the single-user setting and $\sqrt{u} q_{\max}^2/2^{2.5 n}$ in the multi-user setting (where $u$ is the number of users and $q_{\max}$ is the number of queries per user).
In this paper, we prove an indistinguishability bound of $q/2^{(r - 0.5)n}$ for the (generalized) XoP construction in the single-user setting, and a bound of $\sqrt{u} q_{\max}/2^{(r - 0.5)n}$ in the multi-user setting. In particular, for $r=2$, we obtain the bounds $q/2^{1.5n}$ and $\sqrt{u} q_{\max}/2^{1.5n}$ in single-user and multi-user settings, respectively. For $r=3$ the corresponding bounds are $q/2^{2.5n}$ and $\sqrt{u} q_{\max}/2^{2.5n}$. All of these bounds hold assuming $q < 2^{n}/2$ (or $q_{\max} < 2^{n}/2$).
Compared to previous works, we improve all the best-known bounds for the (generalized) XoP construction in the multi-user setting, and the best-known bounds for the generalized XoP construction for $r \geq 3$ in the single-user setting (assuming $q \geq 2^{n/2}$). For the basic two-permutation XoP construction in the single-user setting, our concrete bound of $q/2^{1.5n}$ stands in contrast to the asymptotic bound of $O(q/2^{1.5n})$ by Eberhard.
Since all of our bounds are matched (up to constant factors) for $q > 2^{n/2}$ by attacks published by Patarin in 2008 (and their generalizations to the multi-user setting), they are all tight.
We obtain our results by Fourier analysis of Boolean functions. Most of our technical work involves bounding (sums of) Fourier coefficients of the density function associated with sampling without replacement. While the proof of Eberhard relies on similar bounds, our proof is elementary and simpler.

Efficient Instances of Docked Double Decker With AES, and Application to Authenticated Encryption

A tweakable wide blockcipher is a construction which behaves in the same way as a tweakable blockcipher, with the difference that the actual block size is flexible. Due to this feature, a tweakable wide blockcipher can be directly used as a strong encryption scheme that provides full diffusion when encrypting plaintexts to ciphertexts and vice versa. Furthermore, it can be the basis of authenticated encryption schemes fulfilling the strongest security notions. In this paper, we present two instantiations of the docked double decker tweakable wide blockcipher: $\mathit{ddd}\text{-}\mathit{AES}$ and $\mathit{bbb}\text{-}\mathit{ddd}\text{-}\mathit{AES}$. Both instances exclusively use similar building blocks as AES-GCM (AES and finite field multiplication), are designed for maximal parallelism, and hence, can make efficient use of existing hardware accelerators. Moreover, $\mathit{bbb}\text{-}\mathit{ddd}\text{-}\mathit{AES}$ builds upon a novel conditionally beyond birthday bound secure pseudorandom function, a tweakable variant of the XOR of permutations, facilitating in the need to include a tweak in the AES evaluations without sacrificing flexibility in docked double decker. We furthermore introduce an authenticated encryption mode $\mathit{aaa}$ specifically tailored to be instantiated with $\mathit{ddd}\text{-}\mathit{AES}$ and $\mathit{bbb}\text{-}\mathit{ddd}\text{-}\mathit{AES}$, where special attentions is given to how the nonce and associated data can be processed. We prove that this mode is secure in the nonce-respecting setting, in the nonce-misuse setting, as well as in the setting where random nonces are used.

SQIAsignHD: SQIsignHD Adaptor Signature

Adaptor signatures can be viewed as a generalized form of the standard digital signature schemes where a secret randomness is hidden within a signature. Adaptor signatures are a recent cryptographic primitive and are becoming an important tool for blockchain applications such as cryptocurrencies to reduce on-chain costs, improve fungibility, and contribute to off-chain forms of payment in payment-channel networks, payment-channel hubs, and atomic swaps. However, currently used adaptor signature constructions are vulnerable to quantum adversaries due to Shor’s algorithm. In this work, we introduce SQIAsignHD, a new quantum-resistant adaptor signature scheme based on isogenies of supersingular elliptic curves, using SQIsignHD - as the underlying signature scheme - and exploiting the idea of the artificial orientation on the supersingular isogeny Diffie-Hellman key exchange protocol, SIDH, as the underlying hard relation. We, furthermore, show that our scheme is secure in the Quantum Random Oracle Model (QROM).

Two-Party Decision Tree Training from Updatable Order-Revealing Encryption

Running machine learning algorithms on encrypted data is a way forward to marry functionality needs common in industry with the important concerns for privacy when working with potentially sensitive data. While there is already a growing field on this topic and a variety of protocols, mostly employing fully homomorphic encryption or performing secure multiparty computation (MPC), we are the first to propose a protocol that makes use of a specialized encryption scheme that allows to do secure comparisons on ciphertexts and update these ciphertexts to be encryptions of the same plaintexts but under a new key. We call this notion Updatable Order-Revealing Encryption (uORE) and provide a secure construction using a key-homomorphic pseudorandom function.
In a second step, we use this scheme to construct an efficient three-round protocol between two parties to compute a decision tree (or forest) on labeled data provided by both parties. The protocol is in the passively-secure setting and has some leakage on the data that arises from the comparison function on the ciphertexts. We motivate how our protocol can be compiled into an actively-secure protocol with less leakage using secure enclaves, in a graceful degradation manner, e.g. falling back to the uORE leakage, if the enclave becomes fully transparent. We also analyze the leakage of this approach, giving an upper bound on the leaked information. Analyzing the performance of our protocol shows that this approach allows us to be much more efficient (especially w.r.t. the number of rounds) than current MPC-based approaches, hence allowing for an interesting trade-off between security and performance.

Convolution-Friendly Image Compression in FHE

Fully Homomorphic Encryption (FHE) is a powerful tool that brings privacy and security to all sorts of applications by allowing us to perform additions and multiplications directly on ciphertexts without the need of the secret key.
Some applications of FHE that were previously overlooked but have recently been gaining traction are data compression and image processing.
Practically, FHE enables applications such as private satellite searching,
private object recognition, or even encrypted video editing.
We propose a practical FHE-friendly image compression and processing pipeline where an image can be compressed and encrypted on the client-side, sent to a server which decompresses it homomorphically and then performs image processing in the encrypted domain before returning the encrypted result to the client.
Inspired by JPEG, our pipeline also relies on discrete cosine transforms
and quantization to simplify the representation of an image in the frequency domain, making it possible to effectively use a compression algorithm.
This pipeline is designed to be compatible with existing image-processing techniques in FHE, such as pixel-wise processing and convolutional filters.
Using this technique, a high-definition ($1024\times1024$) image can be homomorphically decompressed, processed with a convolutional filter and re-compressed in under $24.7$s, while using ~8GB memory.

Crypto Dark Matter on the Torus: Oblivious PRFs from shallow PRFs and FHE

Partially Oblivious Pseudorandom Functions (POPRFs) are 2-party protocols that allow a client to learn pseudorandom function (PRF) evaluations on inputs of its choice from a server. The client submits two inputs, one public and one private. The security properties ensure that the server cannot learn the private input, and the client cannot learn more than one evaluation per POPRF query. POPRFs have many applications including password-based key exchange and privacy-preserving authentication mechanisms. However, most constructions are based on classical assumptions, and those with post quantum security suffer from large eﬀiciency drawbacks.
In this work, we construct a novel POPRF from lattice assumptions and the “Crypto Dark Matter” PRF candidate (TCC’18) in the random oracle model. At a conceptual level, our scheme exploits the alignment of this family of PRF candidates, relying on mixed modulus computations, and programmable bootstrapping in the torus fully homomorphic encryption scheme (TFHE). We show that our construction achieves malicious client security based on circuit-private FHE, and client privacy from the semantic security of the FHE scheme. We further explore a heuristic approach to extend our scheme to support verifiability, based on the difficulty of computing cheating circuits in low depth. This would yield a verifiable (P)OPRF. We provide a proof-of-concept implementation and preliminary benchmarks of our construction. For the core online OPRF functionality, we require amortised 10.0KB communication per evaluation and a one-time per-client setup communication of 2.5MB.

PLASMA: Private, Lightweight Aggregated Statistics against Malicious Adversaries

Private heavy-hitters is a data-collection task where multiple clients possess private bit strings, and data-collection servers aim to identify the most popular strings without learning anything about the clients' inputs. In this work, we introduce PLASMA: a private analytics framework in the three-server setting that protects the privacy of honest clients and the correctness of the protocol against a coalition of malicious clients and a malicious server.
Our core primitives are a verifiable incremental distributed point function (VIDPF) and a batched consistency check, which are of independent interest. Our VIDPF introduces new methods to validate client inputs based on hashing. Meanwhile, our batched consistency check uses Merkle trees to validate multiple client sessions together in a batch. This drastically reduces server communication across multiple client sessions, resulting in significantly less communication compared to related works. Finally, we compare PLASMA with the recent works of Asharov et al. (CCS'22) and Poplar (S&P'21) and compare in terms of monetary cost for different input sizes.

A provably masked implementation of BIKE Key Encapsulation Mechanism

BIKE is a post-quantum key encapsulation mechanism (KEM) selected for the 4th round of the NIST's standardization campaign. It relies on the hardness of the syndrome decoding problem for quasi-cyclic codes and on the indistinguishability of the public key from a random element, and provides the most competitive performance among round 4 candidates, which makes it relevant for future real-world use cases. Analyzing its side-channel resistance has been highly encouraged by the community and several works have already outlined various side-channel weaknesses and proposed ad-hoc countermeasures. However, in contrast to the well-documented research line on masking lattice-based algorithms, the possibility of generically protecting code-based algorithms by masking has only been marginally studied in a 2016 paper by Chen et al. in SAC 2015.
At this stage of the standardization campaign, it is important to assess the possibility of fully masking BIKE scheme and the resulting cost in terms of performances.
In this work, we provide the first high-order masked implementation of a code-based algorithm. We had to tackle many issues such as finding proper ways to handle large sparse polynomials, masking the key-generation algorithm or keeping the benefit of the bitslicing. In this paper, we present all the gadgets necessary to provide a fully masked implementation of BIKE, we discuss our different implementation choices and we propose a full proof of masking in the Ishai Sahai and Wagner (Crypto 2003) model.
More practically, we also provide an open C-code masked implementation of the key-generation, encapsulation and decapsulation algorithms with extensive benchmarks. While the obtained performance is slower than existing masked lattice-based algorithms, we show that masking at order 1, 2, 3, 4 and 5 implies a performance penalty of x5.8, x14.2, x24.4, x38 and x55.6 compared to order 0 (unmasked and unoptimized BIKE). This scaling is encouraging and no Boolean to Arithmetic conversion has been used.

BOLT: Privacy-Preserving, Accurate and Efficient Inference for Transformers

The advent of transformers has brought about significant advancements in traditional machine learning tasks. However, their pervasive deployment has raised concerns about the potential leakage of sensitive information during inference. Existing approaches using secure multiparty computation (MPC) face limitations when applied to transformers due to the extensive model size and resource-intensive matrix-matrix multiplications. In this paper, we present BOLT, a privacy-preserving inference framework for transformer models that supports efficient matrix multiplications and nonlinear computations. Combined with our novel machine learning optimizations, BOLT reduces the communication cost by 10.91x. Our evaluation on diverse datasets demonstrates that BOLT maintains comparable accuracy to floating-point models and achieves 4.8-9.5x faster inference across various network settings compared to the state-of-the-art system.

Scoring the predictions: a way to improve profiling side-channel attacks

Side-channel analysis is an important part of the security evaluations of hardware components and more specifically of those that include cryptographic algorithms. Profiling attacks are among the most powerful attacks as they assume the attacker has access to a clone device of the one under attack. Using the clone device allows the attacker to make a profile of physical leakages linked to the execution of algorithms. This work focuses on the characteristics of this profile and the information that can be extracted from its application to the attack traces. More specifically, looking at unsuccessful attacks, it shows that by carefully ordering the attack traces used and limiting their number, better results can be achieved with the same profile. Using this method allows us to consider the classical attack method, i.e. where the traces are randomly ordered, as the worst case scenario. The best case scenario is when the attacker is able to successfully order and select the best attack traces. A method for identifying efficient ordering when using deep learning models as profiles is also provided. A new loss function "Scoring loss" is dedicated to training machine learning models that give a score to the attack prediction and the score can be used to order the predictions.

Permutation-Based Hash Chains with Application to Password Hashing

Hash chain based password systems are a useful way to guarantee authentication with one-time passwords. The core idea is specified in RFC 1760 as S/Key. At CCS 2017, Kogan et al. introduced T/Key, an improved password system where one-time passwords are only valid for a limited time period. They proved security of their construction in the random oracle model under a basic modeling of the adversary. In this work, we make various advances in the analysis and instantiation of hash chain based password systems. Firstly, we describe a slight generalization called U/Key that allows for more flexibility in the instantiation and analysis, and we develop a security model that refines the adversarial strength into offline and online complexity, that can be used beyond the random oracle model, and that allows to argue multi-user security directly. Secondly, we derive a new security proof of U/Key in the random oracle model, as well as dedicated and tighter security proofs of U/Key instantiated with a sponge construction and a truncated permutation. When applied to T/Key, these results improve significantly over the earlier results: whereas the originally suggested instantiation using SHA-256 achieved 128 bits of security using a hash function with a state size of 384 bits, with a truncated permutation construction one can achieve 128 bits of security already with a state size of 256 bits.

OpenPubkey: Augmenting OpenID Connect with User held Signing Keys

OpenPubkey makes a client-side modification to OpenID Connect so that an ID Token issued by an OpenID Provider commits to a user held public key. This transforms an ID Token into a certificate that cryptographically binds an OpenID Connect identity to a public key. We call such an ID Token, a PK Token. The user can then sign messages with their signing key and these signatures can be authenticated and attributed to the user’s OpenID Connect identity. This allows OpenPubkey to upgrade OpenID Connect from Bearer Authentication to Proof-of-Possession, eliminating trust assumptions in OpenID Connect and defeating entire categories of attacks present in OpenID Connect. OpenPubkey was designed to satisfy a decade-long need for this functionality. Prior to OpenPubkey, OpenID Connect did not have a secure way for users to sign statements under their OpenID identities.
OpenPubkey is transparent to users and OpenID Providers. An OpenID Provider can not even determine that OpenPubkey is being used. This makes OpenPubkey fully compatible with existing OpenID Providers. In fact a variant of OpenPubkey is currently deployed and used to authenticate signed messages and identities for users with accounts on Google, Microsoft, Okta, and Onelogin. OpenPubkey does not add new trusted parties to OpenID Connect and reduces preexisting trust assumptions. If used in tandem with our MFA-cosigner, OpenPubkey can maintain security even against a malicious OpenID Provider (the most trusted party in OpenID Connect).

Some notes on algorithms for abelian varieties

A compilation of various notes written over the years on some algorithms on abelian varieties.

The geometric interpretation of the Tate pairing and its applications

While the Weil pairing is geometric, the Tate pairing is arithmetic: its value depends on the base field considered. Nevertheless, the étale topology allows to interpret the Galois action in a geometric manner. In this paper, we discuss this point of view for the Tate pairing: its natural geometric interpretation is that it gives étale $\mu_n$-torsors. While well known to experts, this interpretation is perhaps less known in the cryptographic community.
As an application, we explain how to use the Tate pairing to study the fibers of an isogeny, and we prove a conjecture by Castryck and Decru on multiradical isogenies.

Menhir: An Oblivious Database with Protection against Access and Volume Pattern Leakage

Analyzing user data while protecting the privacy of individuals remains a big challenge. Trusted execution environments (TEEs) are a possible solution as they protect processes and Virtual Machines (VMs) against malicious hosts. However, TEEs can leak access patterns to code and to the data being processed. Furthermore, when data is stored in a TEE database, the data volume required to answer a query is another unwanted side channel that contains sensitive information. Both types of information leaks, access patterns and volume patterns, allow for database reconstruction attacks.
In this paper, we present Menhir, an oblivious TEE database that hides access patterns with ORAM guarantees and volume patterns through differential privacy. The database allows range and point queries with SQL-like WHERE-clauses. It builds on the state-of-the-art oblivious AVL tree construction Oblix (S&P'18), which by itself does not protect against volume leakage. We show how volume leakage can be exploited in range queries and improve the construction to mitigate this type of attack. We prove the correctness and obliviousness of Menhir. Our evaluation shows that our approach is feasible and scales well with the number of rows and columns in the database.

Threshold implementations of cryptographic functions between finite Abelian groups

The threshold implementation technique has been proposed in 2006 by Nikova et al. as a countermeasure to mitigate cryptographic side-channel attacks on hardware implementations when the effect of glitches is taken into account. This technique is based on Boolean sharing (also called masking) and it was developed for securing symmetric ciphers such as AES. In 2023, Piccione et al. proposed a general construction of threshold implementations that is universal for S-boxes that are bijective vectorial Boolean function (functions from a binary vector space $\mathbb{F}_{2}^n$ into itself). In this paper, we further generalize the construction and we propose a general theory of threshold implementations for any type of S-boxes. We investigate the case of functions (also not necessarily bijective) that are defined between two finite Abelian groups and we use the definition of threshold implementation given by Dhooghe et al. in 2019 with additive sharing. To show that this generalized notion is as useful as the one for Boolean sharing, we prove that many classical results still hold. An important tool in this theory is the notion of functional degree introduced by Aichinger and Moosbauer in 2021 which generalizes the algebraic degree of a vectorial Boolean function. We show that if $F$ has functional degree (at most) $d$ and the cardinality of the domain is divisible by the cardinality of the codomain, then $F$ admits a threshold implementation $\mathcal{F}$ with $s\geq d+2$ shares in input and $d+2$ shares in output. Moreover, we provide a complete overview on which are the available tools for studying the functional degree and how to represent those functions using a Integer-Valued (IV) polynomial representation. Then we apply our theory for the following applications: defining the inner product masking in our setting, providing a threshold implementation of any multiplication map, and computing the functional degree and the IV polynomial representations of the conversion maps between $\mathbb{F}_p^n$ and $\mathbb{Z}_{p^n}$.

Quantum Algorithms for Lattice Problems

Uncategorized

Uncategorized

We show a polynomial time quantum algorithm for solving the learning with errors problem (LWE) with certain polynomial modulus-noise ratios. Combining with the reductions from lattice problems to LWE shown by Regev [J.ACM 2009], we obtain polynomial time quantum algorithms for solving the decisional shortest vector problem (GapSVP) and the shortest independent vector problem (SIVP) for all $n$-dimensional lattices within approximation factors of $\tilde{\Omega}(n^{4.5})$. Previously, no polynomial or even subexponential time quantum algorithms were known for solving GapSVP or SIVP for all lattices within any polynomial approximation factors.
To develop a quantum algorithm for solving LWE, we mainly introduce two new techniques. First, we introduce Gaussian functions with complex variances in the design of quantum algorithms. In particular, we exploit the feature of the Karst wave in the discrete Fourier transform of complex Gaussian functions. Second, we use windowed quantum Fourier transform with complex Gaussian windows, which allows us to combine the information from both time and frequency domains. Using those techniques, we first convert the LWE instance into quantum states with purely imaginary Gaussian amplitudes, then convert purely imaginary Gaussian states into classical linear equations over the LWE secret and error terms, and finally solve the linear system of equations using Gaussian elimination. This gives a polynomial time quantum algorithm for solving LWE.

Fast pairings via biextensions and cubical arithmetic

Biextensions associated to line bundles on abelian varieties allows to reinterpret the usual Weil, Tate, Ate, optimal Ate, \ldots, pairings as monodromy pairings. We introduce a cubical arithmetic, derived from the canonical cubical torsor structure of these line bundles, to obtain an efficient arithmetic of these biextensions.
This unifies and extends Miller's standard algorithm to compute pairings along with other algorithms like elliptic nets and theta functions, and allows to adapt these algorithms to pairings on any model of abelian varieties with a polarisation $\Phi_D$, as long as we have an explicit theorem of the square for $D$.
In particular, we give explicit formulas for the arithmetic of the biextension (and cubical torsor structure) associated to the divisor $D=2(0_E)$ on an elliptic curve. We derive very efficient pairing formulas on elliptic curves and Kummer lines. Notably for generic pairings on Montgomery curves, our cubical biextension ladder algorithm to compute pairings costs only $15M$ by bits, which as far as I know is faster than any pairing doubling formula in the literature.

Efficient Linkable Ring Signatures: New Framework and Post-Quantum Instantiations

In this paper, we introduce a new framework for constructing linkable ring signatures (LRS). Our framework is based purely on signatures of knowledge (SoK) which allows one to issue signatures on behalf of any NP-statement using the corresponding witness. Our framework enjoys the following advantages: (1) the security of the resulting LRS depends only on the security of the underlying SoK; (2) the resulting LRS naturally supports online/offline signing (resp. verification), where the output of the offline signing (resp. verification) can be re-used across signatures of the same ring. For a ring size $n$, our framework requires a SoK of the NP statement with size $\log n$.
To instantiate our framework, we adapt the well-known post-quantum secure non-interactive argument of knowledge (NIAoK), ethSTARK, into an SoK. This SoK inherents the post-quantum security and has a signature size poly-logarithmic in the size of the NP statement. Thus, our resulting LRS has a signature size of $O(\text{polylog}(\log n))$. By comparison, existing post-quantum ring signatures, regardless of linkability considerations, have signature sizes of $O(\log n)$ at best. Furthermore, leveraging online/offline verification, part of the verification of signatures on the same ring can be shared, resulting in a state-of-the-art amortized verification cost of $O(\text{polylog}(\log n))$.
Our LRS also performs favourably against existing schemes in practical scenarios. Concretely, our scheme has the smallest signature size among all post-quantum ring signatures for any ring size larger than $32$. In our experiment, at $128$-bit security and ring size of $1024$, our LRS has a size of $29$KB, and an amortized verification cost of $0.3$ ms, surpassing the state-of-the-art by a significant margin. Even without considering amortization, the verification time for a single signature is $128$ ms, which is still 10x better than state-of-the-art succinct construction, marking it comparable to those featuring linear signature size. A similar performance advantage can also be seen at signing.

Families of prime-order endomorphism-equipped embedded curves on pairing-friendly curves

This paper presents a procedure to construct parameterized families
of prime-order endomorphism-equipped elliptic curves that are defined over the
scalar field of pairing-friendly elliptic curve families such as Barreto–Lynn–Scott
(BLS), Barreto–Naehrig (BN) and Kachisa–Schaefer–Scott (KSS), providing general
formulas derived from the curves’ seeds. These so-called “embedded curves” are of
major interest in SNARK applications that prove statements involving elliptic curve
arithmetic i.e. digital signatures. In this paper, the mathematical groundwork is laid,
and advantages of these embeddings are discussed. Additionally, practical examples
are included at the end.

X-Cipher: Achieving Data Resiliency in Homomorphic Ciphertexts

Homomorphic encryption (HE) allows for computations over ciphertexts while they are encrypted. Because of this, HE supports the outsourcing of computation on private data. Due to the additional risks caused by data outsourcing, the ability to recover from losses
is essential, but doing so on data encrypted under an HE scheme introduces additional challenges for recovery and usability. This work introduces X-Cipher, which aims to make HE ciphertexts resilient by ensuring they are private and recoverable simultaneously at all stages during data outsourcing.
X-Cipher allows data recovery without requiring the decryption of HE ciphertexts and maintains its ability to recover and keep data private when a cluster server has been compromised. X-Cipher allows for reduced ciphertext storage overhead by introducing novel encoding and leveraging previously introduced ciphertext packing. X-Cipher's capabilities were evaluated on a synthetic dataset to demonstrate that X-Cipher enables secure availability capabilities while enabling privacy-preserving outsourced computations.

Insights from building a blockchain-based metaverse

This paper presents an in-depth exploration of the development and deployment of a Layer 1 (L1) blockchain designed to underpin metaverse experiences. As the digital and physical realms become increasingly intertwined, the metaverse emerges as a frontier for innovation, demanding robust, scalable, and secure infrastructure. The core of our investigation centers around the challenges and insights gained from constructing a blockchain framework capable of supporting the vast, dynamic environments of the metaverse. Through the development process, we identified key areas of focus: interoperability, performance and scalability, cost, identity, privacy, security, and accessibility.
Our findings indicate that most challenges can be effectively addressed through the implementation of cryptography and subnets (i.e., Avalanche architecture), which allow for segmented, optimized environments within the broader metaverse ecosystem. This approach not only enhances performance but also provides a flexible framework for managing the diverse needs of metaverse applications.

Probabilistic Algorithms with applications to countering Fault Attacks on Lattice based Post-Quantum Cryptography

Fault attacks that exploit the propagation of effective/ineffective faults present a richer attack surface than Differential Fault Attacks, in the sense that the adversary depends on a single bit of information to eventually leak secret cryptographic material. In the recent past, a number of propagation-based fault attacks on Lattice-based Key Encapsulation Mechanisms have been proposed; many of which have no known countermeasures. In this work, we propose an orthogonal countermeasure principle that does not follow adhoc strategies (like shuffling operations on secret coefficients), but rather depends on cryptographically-backed guarantees to provide quantifiable defence against aforementioned fault attacks. Concretely, we propose a framework that uses rejection sampling (which has been traditionally used as alternatives to trapdoors) to convert otherwise deterministic algorithms to probabilistic ones. Our specific goals allow careful selection of distributions such that our framework functions with a constant number of retries (around $2-3$) for unfaulted executions. In other words, should a fault be injected, the probability of success is negligible; for correct execution however, the probability of success is overwhelmingly high. Using our framework, we hence enable probabilistic decryptions in Kyber, NewHope, and Masked Kyber, and completely cut-off fault propagation in known attacks on these constructions, allowing a sound defence against known fault attacks in literature.

Fast Parallelizable Misuse-Resistant Authenticated Encryption: Low Latency (Decryption-Fast) SIV

MRAE security is an important goal for many AEAD applications where the nonce uniqueness cannot be maintained and security risks are significant. However, MRAE schemes can be quite expensive. Two of the SoTA MRAE-secure schemes; Deoxys-II and AES-GCM-SIV rely on internal parallelism and special instructions to achieve competitive performance. However, they both suffer from the same bottleneck, they have at least one call to the underlying primitive that cannot be parallelized to any other call. Romulus-M and LMDAE are two other more recent MRAE secure schemes based on TBCs that target low area hardware. However, they are unparallelizable so they are slower than their counterparts.
In this paper, we present two new AEAD modes and four instantiations based on Tweakable Block Ciphers. These new modes target equipping high-speed applications on parallel platforms with nonce misuse resistant AEAD (MRAE). The first mode, LLSIV, targets similar performance on single-core platforms to SCT-2, while eliminating the bottlenecks that make SCT-2 not fully parallelizable. The enhanced parallelism allows LLSIV to encrypt significantly more blocks on parallel platforms, compared to SCT-2, in the same amount of time. LLSIV is based on the NaT MAC, where each ciphertext block can itself be viewed as an instance of NaT when the plaintext is prepended with $0^n$. The trade-off is that LLSIV requires the inverse function of the TBC. However, the inverse function is used only once per message and we demonstrate that for parallel implementations it represents a very small overhead.
We give an instantiation of LLSIV based on the SKINNY-128-384 TBC, and a pruned scheme, dubbed pLLSIV, which targets enhanced performance compared both SCT-2 and LLSIV on all platforms, while having reduced security claims. It relies on the recently popularized prove-then-prune methodology to take full advantage of the properties of LLSIV. This leads to a significant performance improvement, making pLLSIV even faster than online TBC-based schemes that are not MRAE-secure. Last but not least, we give an instantiation that uses the primitives used in AES-GCM-SIV: the PolyVal hash function and AES. Our instantiation is faster than AES-GCM-SIV on all platforms and have better bounds. On the other hand, it relies on the ideal cipher model as it uses the ICE TBC proposed as part of the Remus AEAD design.
The second mode we describe is LLDFV. It uses ideas from LLSIV combined the Decryption-Fast SIV (DFV) framework proposed recently by Minematsu. The goal is to reduce the number of calls to the TBC by one, while making the scheme as parallelizable as LLSIV. This makes the scheme faster that DFV on all platforms.

Towards Practical Transciphering for FHE with Setup Independent of the Plaintext Space

Fully Homomorphic Encryption (FHE) is a powerful tool to achieve non-interactive privacy preserving protocols with optimal computation/communication complexity. However, the main disadvantage is that the actual communication cost (bandwidth) is high due to the large size of FHE ciphertexts. As a solution, a technique called transciphering (also known as Hybrid Homomorphic Encryption) was introduced to achieve almost optimal bandwidth for such protocols. However, all of existing works require clients to fix a precision for the messages or a mathematical structure for the message space beforehand. It results in unwanted constraints on the plaintext size or underlying structure of FHE based applications.
In this article, we introduce a new approach for transciphering which does not require fixed message precision decided by the client, for the first time. In more detail, a client uses any kind of FHE-friendly symmetric cipher for $\{0,1\}$ to send its input data encrypted bit-by-bit, then the server can choose a precision $p$ depending on the application and homomorphically transforms the encrypted bits into FHE ciphertexts encrypting integers in $\mathbb{Z}_p$. To illustrate our new technique, we evaluate a transciphering using FiLIP cipher and adapt the most practical homomorphic evaluation technique [CCS'22] to keep the practical latency. As a result, our proof-of-concept implementation for $p$ from $2^2$ to $2^8$ takes only from $13$ ms to $137$ ms.

Panacea: Non-interactive and Stateless Oblivious RAM

Oblivious RAM (ORAM) allows a client to outsource storage to a
remote server while hiding the data access pattern from the server. Many ORAM designs have been proposed to reduce the computational overhead and bandwidth blowup for the client. A recent work, Onion Ring ORAM (CCS'19), is able to achieve $O(1)$ bandwidth blowup in the online phase using fully homomorphic encryption (FHE) techniques, at the cost of a computationally expensive client-side offline phase. Furthermore, such a scheme can be categorized as a stateful construction, meaning that the client has to locally maintain a dynamic state representing the order of remote database elements.
We present Panacea: a novel design of ORAM based on FHE techniques, which is non-interactive and stateless, achieves $O(1)$ bandwidth blowup, and does not require an expensive offline phase for the client to perform; in that sense, our design is the first of its kind among other ORAM designs. To provide the client with such performance benefits, our design delegates all expensive computation to the resourceful server.
We additionally show how to boost the server performance significantly using probabilistic batch codes at the cost of only 1.5x in additional bandwidth blowup and 3x expansion in server storage, but less amortized bandwidth.
Our experimental results show that our design, with the batching technique, is practical in terms of server computation overhead as well.
Specifically, for a database size of $2^{19}$, it takes only $1.16$ seconds of amortized computation time for a server to respond to a query.
As a result of the statelessness and low computational overhead on the client, and reasonable computational overhead on the server,
our design is very suitable to be deployed as a cloud-based privacy-preserving storage outsourcing solution with a portable client running on a lightweight device.

Integral Attack on the Full FUTURE Block Cipher

FUTURE is a recently proposed lightweight block cipher that achieved a remarkable hardware performance due to careful design decisions. FUTURE is an Advanced Encryption Standard (AES)-like Substitution-Permutation Network (SPN) with 10 rounds, whose round function consists of four components, i.e., SubCell, MixColumn, ShiftRow and AddRoundKey. Unlike AES, it is a 64-bit-size block cipher with a 128-bit secret key, and the state can be arranged into 16 cells. Therefore, the operations of FUTURE including its S-box is defined over $\mathbb{F}_2^4$. The previous studies have shown that the integral properties of 4-bit S-boxes are usually weaker than larger-size S-boxes, thus the number of rounds of FUTURE, i.e., 10 rounds only, might be too aggressive to provide enough resistance against integral cryptanalysis.
In this paper, we mount the integral cryptanalysis on FUTURE. With state-of-the-art detection techniques, we identify several integral distinguishers of 7 rounds of FUTURE. By extending this 7-round distinguisher by 3 forward rounds, we manage to recover all the 128 bits secret keys from the full FUTURE cipher without the full codebook for the first time. To further achieve better time complexity, we also present a key recovery attack on full FUTURE with full codebook. Both attacks have better time complexity than existing results.

Kirby: A Robust Permutation-Based PRF Construction

We present a construction, called Kirby, for building a variable-input-length pseudorandom function (VIL-PRF) from a $b$-bit permutation. For this construction we prove a tight bound of $b/2$ bits of security on the PRF distinguishing advantage in the random permutation model and in the multi-user setting. Similar to full-state keyed sponge/duplex, it supports full-state absorbing and additionally supports full-state squeezing, while the sponge/duplex can squeeze at most $b-c$ bits per permutation call, for a security level of $c$ bits. This advantage is especially relevant on constrained platforms when using a permutation with small width $b$. For instance, for $b=256$ at equal security strength the squeezing rate of Kirby is twice that of keyed sponge/duplex. This construction could be seen as a generalization of the construction underlying the stream cipher family Salsa. Furthermore, we define a simple mode on top of Kirby that turns it into a deck function with parallel expansion. This is similar to Farfalle but it has a much smaller memory footprint. Furthermore we prove that in the Kirby construction, the leakage of intermediate states does not allow recovering the key or earlier states.

Single Trace is All It Takes: Efficient Side-channel Attack on Dilithium

As the National Institute of Standards and Technology (NIST) concludes its post-quantum cryptography (PQC) competition, the winning algorithm, Dilithium, enters the deployment phase in 2024. This phase underscores the importance of conducting thorough practical security evaluations. Our study offers an in-depth side-channel analysis of Dilithium, showcasing the ability to recover the complete private key, ${s}_1$, within ten minutes using just two signatures and achieving a 60% success rate with a single signature. We focus on analyzing the polynomial addition in Dilithium, $z=y+{cs}_1$, by breaking down the attack into two main phases: the recovery of $y$ and ${cs}_1$ through side-channel attacks, followed by the resolution of a system of error-prone equations related to ${cs}_1$. Employing Linear Regression-based profiled attacks enables the successful recovery of the full $y$ value with a 40% success rate without the necessity for initial filtering. The extraction of ${cs}_1$ is further improved using a CNN model, which boasts an average success rate of 75%. A significant innovation of our research is the development of a constrained optimization-based residual analysis technique. This method efficiently recovers ${s}_1$ from a large set of error-containing equations concerning ${cs}_1$, proving effective even when only 10% of the equations are accurate. We conduct a practical attack on the Dilithium2 implementation on an STM32F4 platform, demonstrating that typically two signatures are sufficient for complete private key recovery, with a single signature sufficing in optimal conditions. Using a general-purpose PC, the full private key can be reconstructed in ten minutes.

Efficient isochronous fixed-weight sampling with applications to NTRU

We present a solution to the open problem of designing an efficient, unbiased and timing attack-resistant shuffling algorithm for NTRU fixed-weight sampling. Although it can be implemented without timing leakages of secret data in any architecture, we illustrate with ARMv7-M and ARMv8-A implementations; for the latter, we take advantage of architectural features such as NEON and conditional instructions, which are representative of features available on architectures targeting similar systems, such as Intel. Our proposed algorithm improves asymptotically upon the current approach, which is based on constant-time sorting networks ($O(n)$ versus $O(n \log^2 n)$), and an implementation of the new algorithm is also faster in practice, by a factor of up to $6.91\ (591\%)$ on ARMv8-A cores and $12.58\ (1158\%)$ on the Cortex-M4; it also requires fewer uniform random bits. This translates into performance improvements for NTRU encapsulation, compared to state-of-the-art implementations, of up to 50% on ARMv8-A cores and 71% on the Cortex-M4, and small improvements to key generation (up to 2.7% on ARMv8-A cores and 6.1% on the Cortex-M4), with negligible impact on code size and a slight improvement in RAM usage for the Cortex-M4.

Fast polynomial multiplication using matrix multiplication accelerators with applications to NTRU on Apple M1/M3 SoCs

Efficient polynomial multiplication routines are critical to the performance of lattice-based post-quantum cryptography (PQC). As PQC standards only recently started to emerge, CPUs still lack specialized instructions to accelerate such routines. Meanwhile, deep learning has grown immeasurably in importance. Its workloads call for teraflops-level of processing power for linear algebra operations, mainly matrix multiplication. Computer architects have responded by introducing ISA extensions, coprocessors and special-purpose cores to accelerate such operations. In particular, Apple ships an undocumented matrix-multiplication coprocessor, AMX, in hundreds of millions of mobile phones, tablets and personal computers. Our work repurposes AMX to implement polynomial multiplication and applies it to the NTRU cryptosystem, setting new speed records on the Apple M1 and M3 systems-on-chip (SoCs): polynomial multiplication, key generation, encapsulation and decapsulation are sped up by $1.54$-$3.07\times$, $1.08$-$1.33\times$, $1.11$-$1.50\times$ and $1.20$-$1.98\times$, respectively, over the previous state-of-the-art.

Ordering Transactions with Bounded Unfairness: Definitions, Complexity and Constructions

An important consideration in the context of distributed ledger protocols is fairness in terms of transaction ordering. Recent work [Crypto 2020] revealed a connection of (receiver) order fairness to social choice theory and related impossibility results arising from the Condorcet paradox. As a result of the impossibility, various relaxations of order fairness were proposed in prior works. Given that distributed ledger protocols, especially those processing smart contracts, must serialize the input transactions, a natural objective is to minimize the distance (in terms of number of transactions) between any pair of unfairly ordered transactions in the output ledger — a concept we call bounded unfairness. In state machine replication (SMR) parlance this asks for minimizing the number of unfair state updates occurring before the processing of any request. This unfairness minimization objective gives rise to a natural class of parametric order fairness definitions that has not been studied before. As we observe, previous realizable relaxations of order fairness do not yield good unfairness bounds.
Achieving optimal order fairness in the sense of bounded unfairness turns out to be connected to the graph theoretic properties of the underlying transaction dependency graph and specifically the bandwidth metric of strongly connected components in this graph. This gives rise to a specific instance of the definition that we call “directed bandwidth order-fairness” which we show that it captures the best possible that any ledger protocol can achieve in terms of bounding unfairness. We prove ordering transactions in this fashion is NP-hard and non-approximable for any constant ratio. Towards realizing the property, we put forth a new distributed ledger protocol called Taxis that achieves directed bandwidth order-fairness. We present two variations, one that matches the property perfectly but (necessarily) lacks in performance and liveness, and another that achieves liveness and better complexity while offering a slightly relaxed version of the property. Finally, we comment on applications of our work to social choice, a direction which we believe to be of independent interest.

Efficient Permutation Correlations and Batched Random Access for Two-Party Computation

In this work we define the notion of a permutation correlation $(\pi,A,B,C)$ s.t. $\pi(A)=B+C$ for a random permutation $\pi$ of $n$ elements and vectors $A,B,C\in \mathbb{F}^n$. We demonstrate the utility of this correlation for a wide range of applications. The correlation can be derandomized to obliviously shuffle a secret-shared list, permute a secret-shared list by a secret-shared permutation, and more. Similar techniques have emerged as a popular building block for the honest majority protocols when efficient batched random access is required, e.g. collaborative filtering, sorting, database joins, graph algorithms, and many more. We present the highly flexible notion of permutation correlation and argue that it should be viewed as a first class primitive in the MPC practitioner's toolbox.
We give two novel protocols for efficiently generating a random permutation correlation. The first makes use of recent advances in MPC-friendly PRFs to obtain a protocol requiring $O(n\ell)$ OTs/time and constant rounds to permute $n$ $\ell$-bit strings. Unlike the modern OT extension techniques we rely on, this was previously only achievable from relatively more expensive public-key cryptography, e.g. Paillier or LWE. We implement this protocol and demonstrate that it can generate a correlation for $n=2^{20},\ell=128$ in 19 seconds and $\sim2\ell n$ communication, a 15 \& $1.1\times$ improvement over the LWE solution of Juvekar at al. (CCS 2018). The second protocol is based on pseudo-random correlation generators and achieves an overhead that is \emph{sublinear} in the string length $\ell$, i.e. the communication and number of OTs is $O(n\log \ell)$. The latter protocol is ideal for the setting when you need to repeatedly permute secret-shared data by the same permutation, e.g. in graph algorithms.
Finally, we present a suite of highly efficient protocols for performing various batched random access operations. These include a class of protocols we refer to as \emph{extraction}, which allow a user to \emph{mark} a subset of $X$ and have this subset obliviously extracted into an output list. Additionally, the parties can specify an \emph{arbitrary} selection function $\sigma:[n]\rightarrow[n]$ and obtain shares of $\sigma(X)=(X_{\sigma(1)},\ldots,X_{\sigma(n)})$ from $X$. We implement these protocols and report on their performance.

Share with Care: Breaking E2EE in Nextcloud

Nextcloud is a leading cloud storage platform with more than 20 million users.
Nextcloud offers an end-to-end encryption (E2EE) feature that is claimed to be able “to keep extremely sensitive data fully secure even in case of a full server breach”.
They also claim that the Nextcloud server “has Zero Knowledge, that is, never has access to any of the data or keys in unencrypted form”.
This is achieved by having encryption and decryption operations that are done using file keys that are only available to Nextcloud clients, with those file keys being protected by a key hierarchy that ultimately relies on long passphrases known exclusively to the users.
We provide the first detailed documentation and security analysis of Nextcloud's E2EE feature.
Nextcloud's strong security claims motivate conducting the analysis in the setting where the server itself is considered malicious.
We present three distinct attacks against the E2EE security guarantees in this setting.
Each one enables the confidentiality and integrity of all user files to be compromised.
All three attacks are fully practical and we have built proof-of-concept implementations for each.
The vulnerabilities make it trivial for a malicious Nextcloud server to access and manipulate users' data.
We have responsibly disclosed the three vulnerabilities to Nextcloud. The second and third vulnerabilities have been remediated. The first was addressed by temporarily disabling file sharing from the E2EE feature until a redesign of the feature can be made.
We reflect on broader lessons that can be learned for designers of E2EE systems.

The many faces of Schnorr

Recently, a number of highly optimized threshold signing protocols for Schnorr signatures have been proposed. While these proposals contain important new techniques, some of them present and analyze these techniques in very specific contexts, making it less than obvious how these techniques can be adapted to other contexts (and perhaps combined with one another). The main goal of this paper is to abstract out and extend in various ways some of these techniques, building a toolbox of techniques that can be easily combined in different ways and in different contexts. To this end, we present security results for various "enhanced" modes of attack on the Schnorr signature scheme in the non-distributed setting. These results support a modular approach to protocol design and analysis in which one reduces the security of a distributed signing protocol to such an enhanced attack mode in the non-distributed setting. We show how these results can be used to easily design new threshold Schnorr protocols that enjoy better security and/or performance properties than existing ones.

Secure Range-Searching Using Copy-And-Recurse

{\em Range searching} is the problem of preprocessing a set of points $P$, such that given a query range $\gamma$ we can efficiently compute some function $f(P\cap\gamma)$. For example, in a 1 dimensional {\em range counting} query, $P$ is a set of numbers, $\gamma$ is a segment and we need to count how many numbers of $P$ are in $\gamma$.
In higher dimensions, $P$ is a set of $d$ dimensional points and the query range is some volume in $R^d$. In general, we want to compute more than just counting, for example, the average of $P\cap\gamma$.
Range searching has applications in databases where some SELECT queries can be translated to range queries.
It had received a lot of attention in computational geometry where a data structure called {\em partition tree} was shown to solve range queries in time sub-linear in $|P|$ using space only linear in $|P|$.
In this paper we consider partition trees under FHE where we answer range queries without learning the value of the points or the parameters of the range.
We show how partition trees can be securely traversed with $O(t \cdot n^{1-\frac{1}{d}+\epsilon} + n^{1+\epsilon})$ operations, where $n=|P|$, $t$ is the number of operations needed to compare to $\gamma$ and $\epsilon>0$ is a parameter. As far as we know, this is the first non-trivial bound on range searching under FHE and it improves over the na\"ive solution that needs $O(t\cdot n)$ operations.
Our algorithms are independent of the encryption scheme but as an example we implemented them using the CKKS FHE scheme. Our experiments show that for databases of sizes $2^{23}$ and $2^{25}$, our algorithms run $\times 2.8$ and $\times 4.7$ (respectively) faster than the na\"ive algorithm.
The improvement of our algorithm comes from a method we call copy-and-recurse. With it we efficiently traverse a $r$-ary tree (where each inner node has $r$ children) that also has the property that at most $\xi$ of them need to be recursed into when traversing the tree.
We believe this method is interesting in its own and can be used to improve traversals in other tree-like structures.

Optimal Asynchronous Byzantine Consensus with Fair Separability

Despite ensuring both consistency and liveness, state machine replication protocols remain vulnerable to adversaries who manipulate the transaction order. To address this, researchers have proposed order-fairness techniques that rely either on building dependency graphs between transactions, or on assigning sequence numbers to transactions. Existing protocols that handle dependency graphs suffer from sub-optimal performance, resilience, or security.
On the other hand, Pompe (OSDI '20) introduced the novel ordering notion of ordering linearizability that uses sequence numbers. However, Pompe's ordering only applies to committed transactions, opening the door to order-fairness violation when there are network delays, and vulnerability to performance downgrade when there are Byzantine attackers. A stronger notion, fair separability, was introduced to require ordering on all observed transactions. However, no implementation of fair separability exists.
In this paper, we introduce a protocol for state machine replication with fair separability ($\mathsf{SMRFS}$); moreover, our protocol has communication complexity $\mathcal{O}(n\ell+\lambda n^2)$, where $n$ is the number of processes, $\ell$ is the input (transaction) size, and $\lambda$ is the security parameter. This is optimal when $\ell\geq \lambda n$, while previous works have cubic communication. To the best of our knowledge, $\mathsf{SMRFS}$ is the first protocol to achieve fair separability, and the first implementation of fair ordering that has optimal communication complexity and optimal Byzantine resilience.

On Sigma-Protocols and (packed) Black-Box Secret Sharing Schemes

$\Sigma$-protocols are a widely utilized, relatively simple and well understood type of zero-knowledge proofs. However, the well known Schnorr $\Sigma$-protocol for proving knowledge of discrete logarithm in a cyclic group of known prime order, and similar protocols working over this type of groups, are hard to generalize to dealing with other groups. In particular with hidden order groups, due to the inability of the knowledge extractor to invert elements modulo the order.
In this paper, we introduce a universal construction of $\Sigma$-protocols designed to prove knowledge of preimages of group homomorphisms for any abelian finite group. In order to do this, we first establish a general construction of a $\Sigma$-protocol for $\mathfrak{R}$-module homomorphism given only a linear secret sharing scheme over the ring $\mathfrak{R}$, where zero knowledge and special soundness can be related to the privacy and reconstruction properties of the secret sharing scheme. Then, we introduce a new construction of 2-out-of-$n$ packed black-box secret sharing scheme capable of sharing $k$ elements of an arbitrary (abelian, finite) group where each share consists of $k+\log n-3$ group elements.
From these two elements we obtain a generic ``batch'' $\Sigma$-protocol for proving knowledge of $k$ preimages of elements via the same group homomorphism, which communicates $k+\lambda-3$ elements of the group to achieve $2^{-\lambda}$ knowledge error.
For the case of class groups, we show that our $\Sigma$-protocol improves in several aspects on existing proofs for knowledge of discrete logarithm and other related statements that have been used in a number of works.
Finally, we extend our constructions from group homomorphisms to the case of ZK-ready functions, introduced by Cramer and Damg\aa rd in Crypto 09, which in particular include the case of proofs of knowledge of plaintext (and randomness) for some linearly homomorphic encryption schemes such as Joye-Libert encryption. However, in the case of Joye-Libert, we show an even better alternative, using Shamir secret sharing over Galois rings, which achieves $2^{-k}$ knowledge soundness by communicating $k$ ciphertexts to prove $k$ statements.

A post-quantum Distributed OPRF from the Legendre PRF

A distributed OPRF allows a client to evaluate a pseudorandom function on an input chosen by the client using a distributed key shared among multiple servers. This primitive ensures that the servers learn nothing about the input nor the output, and the client learns nothing about the key.
We present a post-quantum OPRF in a distributed server setting, which can be computed in a single round of communication between a client and the servers.
The only server-to-server communication occurs during a precomputation phase.
The algorithm is based on the Legendre PRF which can be computed from a single MPC multiplication among the servers.
To this end we propose two MPC approaches to evaluate the Legendre PRF based on replicated and optimised secret sharing, respectively. Furthermore, we propose two methods that allows us to perform MPC multiplication in an efficient way that are of independent interest.
By employing the latter, we are able to evaluate the Legendre OPRF in a fashion that is quantum secure, verifiable and secure against malicious adversaries under a threshold assumption, as well as computable in a single round of interaction.
To the best of our knowledge, our proposed distributed OPRFs are the first post-quantum secure offering such properties.
We also provide an implementation of our protocols, and benchmark it against existing OPRF constructions.

Max Attestation Matters: Making Honest Parties Lose Their Incentives in Ethereum PoS

We present staircase attack, the first attack on the incentive mechanism of the Proof-of-Stake (PoS) protocol used in Ethereum 2.0 beacon chain. Our attack targets the penalty of the incentive mechanism that penalizes inactive participation. Our attack can make honest validators suffer from penalties, even if they strictly follow the specification of the protocol. We show both theoretically and experimentally that if the adversary controls 29.6% stake in a moderate-size system, the attack can be launched continuously, so eventually all honest validators will lose their incentives. In contrast, the adversarial validators can still receive incentives, and the stake owned by the adversary can eventually exceed the $1/3$ threshold (system assumption), posing a threat to the security properties of the system.
In practice, the attack feasibility is directly related to two parameters: the number of validators and the parameter MAX_ATTESTATION, the maximum number of attestations (i.e., votes) that can be included in each block. We further modify our attack such that, with current system setup (850,000 validators and MAX_ATTESTATION=128), our attack can be launched continuously with a probability of 80.25%. As a result, the incentives any honest validator receives are only 28.9% of its fair share.

A Note on the Common Haar State Model

Common random string model is a popular model in classical cryptography with many constructions proposed in this model. We study a quantum analogue of this model called the common Haar state model, which was also studied in an independent work by Chen, Coladangelo and Sattath (arXiv 2024). In this model, every party in the cryptographic system receives many copies of one or more i.i.d Haar states.
Our main result is the construction of a statistically secure PRSG with: (a) the output length of the PRSG is strictly larger than the key size, (b) the security holds even if the adversary receives $O\left(\frac{\lambda}{(\log(\lambda))^{1.01}} \right)$ copies of the pseudorandom state. We show the optimality of our construction by showing a matching lower bound. Our construction is simple and its analysis uses elementary techniques.

Breaking Bicoptor from S$\&$P 2023 Based on Practical Secret Recovery Attack

At S$\&$P 2023, a family of secure three-party computing protocols called Bicoptor was mainly proposed by Huawei Technology in China, which is used to compute non-linear functions in privacy preserving machine learning. In these protocols, two parties $P_0, P_1$ respectively hold the corresponding shares of the secret, while a third party $P_2$ acts as an assistant. The authors claimed that neither party in the Bicoptor can independently compromise the confidentiality of the input, intermediate, or output. In this paper, we point out that this claim is incorrect. The assistant $P_2$ can recover the secret in the DReLU protocol, which is the basis of Bicoptor. The restoration of its secret will result in the security of the remaining protocols in Bicoptor being compromised. Specifically, we provide two secret recovery attacks regarding the DReLU protocol. The first attack method belongs to a clever enumeration method, which is mainly due to the derivation of the modular equation about the secret and its share. The key of the second attack lies in solving the small integer root problem of a modular equation, as the lattices involved are only 3 or 4 dimensions, the LLL algorithm can effectively work. For the system settings selected by Bicoptor, our experiment shows that the desired secret in the DReLU protocol can be restored within one second on a personal computer. Therefore, when using cryptographic protocols in the field of privacy preserving machine learning, it is not only important to pay attention to design overhead, but also to be particularly careful of potential security threats.

Dual Support Decomposition in the Head: Shorter Signatures from Rank SD and MinRank

The MPC-in-the-Head (MPCitH) paradigm is widely used for building post-quantum signature schemes, as it provides a versatile way to design proofs of knowledge based on hard problems. Over the years, the MPCitH landscape has changed significantly, with the most recent improvement coming from VOLE-in-the-Head (VOLEitH) and Threshold-Computation-in-the-Head (TCitH).
While a straightforward application of these frameworks already improve the existing MPCitH-based signatures, we show in this work that we can adapt the arithmetic constraints representing the underlying security assumptions (here called the modeling) to achieve smaller sizes using these new techniques.
More precisely, we explore existing modelings for the rank syndrome decoding (RSD) and MinRank problems and we introduce a new modeling, named dual support decomposition, which achieves better sizes with the VOLEitH and TCitH frameworks by minimizing the size of the witnesses.
While this modeling is naturally more efficient than the other ones for a large set of parameters, we show that it is possible to go even further and explore new areas of parameters. With these new modeling and parameters, we obtain low-size witnesses which drastically reduces the size of the ``arithmetic part'' of the signature.
We apply our new modeling to both TCitH and VOLEitH frameworks and compare our results to RYDE, MiRitH, and MIRA signature schemes. We obtain signature sizes below 4 kB for 128 bits of security with N=256 parties (a.k.a. leaves in the GGM trees) and going as low as $\approx$ 3.5 kB with N=2048, for both RSD and MinRank. This represents an improvement of more than 1.5 kB compared to the original submissions to the 2023 NIST call for additional signatures. We also note that recent techniques optimizing the sizes of GGM trees are applicable to our schemes and further reduce the signature sizes by a few hundred bytes, bringing them arround 3 kB (for 128 bits of security with N=2048).

Lattice-Based Timed Cryptography

Timed cryptography studies primitives that retain their security only for a predetermined amount of time, such as proofs of sequential work and time-lock puzzles. This feature has proven to be useful in a large number of practical applications, e.g. randomness generation, sealed-bid auctions, and fair multi-party computation. However, the current state of affairs in timed cryptography is unsatisfactory: Virtually all efficient constructions rely on a single sequentiality assumption, namely that repeated squaring in unknown order groups cannot be parallelised. This is a single point of failure in the classical setting and is even false against quantum adversaries.
In this work we put forward a new sequentiality assumption, which essentially says that a repeated application of the standard lattice-based hash function cannot be parallelised. We provide concrete evidence of the validity of this assumption and perform some initial cryptanalysis. We also propose a new template to construct proofs of sequential work, based on lattice techniques.

Supersingular Hashing using Lattès Maps

In this note we propose a variant (with four sub-variants) of the Charles--Goren--Lauter (CGL) hash function using Lattès maps over finite fields. These maps define dynamical systems on the projective line. The underlying idea is that these maps ``hide'' the $j$-invariants in each step in the isogeny chain, similar to the Merkle--Damgård construction. This might circumvent the problem concerning the knowledge of the starting (or ending) curve's endomorphism ring, which is known to create collisions in the CGL hash function.
Let us, already in the abstract, preface this note by remarking that we have not done any explicit computer experiments and benchmarks (apart from a small test on the speed of computing the orbits), nor do we make any security claims. Part of the reason for this is the author's lack of competence in complexity theory and evaluation of security claims. Instead this note is only meant as a presentation of the main idea, the hope being that someone more competent will find it interesting enough to pursue further.

Abuse Reporting for Metadata-Hiding Communication Based on Secret Sharing

As interest in metadata-hiding communication grows in both research and practice, a need exists for stronger abuse reporting features on metadata-hiding platforms. While message franking has been deployed on major end-to-end encrypted platforms as a lightweight and effective abuse reporting feature, there is no comparable technique for metadata-hiding platforms. Existing efforts to support abuse reporting in this setting, such as asymmetric message franking or the Hecate scheme, require order of magnitude increases in client and server computation or fundamental changes to the architecture of messaging systems. As a result, while metadata-hiding communication inches closer to practice, critical content moderation concerns remain unaddressed.
This paper demonstrates that, for broad classes of metadata-hiding schemes, lightweight abuse reporting can be deployed with minimal changes to the overall architecture of the system. Our insight is that much of the structure needed to support abuse reporting already exists in these schemes. By taking a non-generic approach, we can reuse this structure to achieve abuse reporting with minimal overhead. In particular, we show how to modify schemes based on secret sharing user inputs to support a message franking-style protocol. Compared to prior work, our shared franking technique more than halves the time to prepare a franked message and gives order of magnitude reductions in server-side message processing times, as well as in the time to decrypt a message and verify a report.

Analysing Cryptography in the Wild - A Retrospective

We reflect on our experiences analysing cryptography deployed “in the wild” and give recommendations to fellow researchers about this process.

A comment on "Comparing the MOV and FR reductions in elliptic curve cryptography" from EUROCRYPT'99

In general the discrete logarithm problem is a hard problem in the elliptic curve cryptography, and the best known solving algorithm have exponential running time. But there exists a class of curves, i.e. supersingular elliptic curves, whose discrete logarithm problem has a subexponential solving algorithm called the MOV attack. In 1999, the cost of the MOV reduction is still computationally expensive due to the power of computers. We analysis the cost of the MOV reduction and the discrete logarithm problem of the curves in \cite{HSSI99} using Magma with an ordinary computer.

Ring/Module Learning with Errors under Linear Leakage -- Hardness and Applications

This paper studies the hardness of decision Module Learning with Errors (\MLWE) under linear leakage, which has been used as a foundation to derive more efficient lattice-based zero-knowledge proofs in a recent paradigm of Lyubashevsky, Nguyen, and Seiler (PKC 21). Unlike in the plain \LWE~setting, it was unknown whether this problem remains provably hard in the module/ring setting.
This work shows a reduction from the search \MLWE~to decision \MLWE~with linear leakage. Thus, the main problem remains hard asymptotically as long as the non-leakage version of \MLWE~is hard. Additionally, we also refine the paradigm of Lyubashevsky, Nguyen, and Seiler (PKC 21) by showing a more fine-grained tradeoff between efficiency and leakage. This can lead to further optimizations of lattice proofs under the paradigm.