## Papers updated in last 183 days (1659 results)

Key-and-Signature Compact Multi-Signatures for Blockchain: A Compiler with Realizations

Multi-signature is a protocol where a set of signatures jointly sign a message so that the final signature is significantly shorter than concatenating individual signatures together. Recently, it finds applications in blockchain, where several users want to jointly authorize a payment through a multi-signature. However, in this setting, there is no centralized authority and it could suffer from a rogue key attack where the attacker can generate his own keys arbitrarily. Further, to minimize the storage on blockchain, it is desired that the aggregated public-key and the aggregated signature are both as short as possible. In this paper, we find a compiler that converts a kind of identification (ID) scheme (which we call a linear ID) to a multi-signature so that both the aggregated public-key and the aggregated signature have a size independent of the number of signers. Our compiler is provably secure. The advantage of our results is that we reduce a multi-party problem to a weakly secure two-party problem. We realize our compiler with two ID schemes. The first is Schnorr ID. The second is a new lattice-based ID scheme, which via our compiler gives the first regular lattice-based multi-signature scheme with key-and-signature compact without a restart during signing process.

Nova: Recursive Zero-Knowledge Arguments from Folding Schemes

We introduce a new approach to realize incrementally verifiable computation (IVC), in which the prover recursively proves the correct execution of incremental computations of the form $y=F^{(\ell)}(x)$, where $F$ is a (potentially non-deterministic) computation, $x$ is the input, $y$ is the output, and $\ell > 0$. Unlike prior approaches to realize IVC, our approach avoids succinct non-interactive arguments of knowledge (SNARKs) entirely and arguments of knowledge in general. Instead, we introduce and employ folding schemes, a weaker, simpler, and more efficiently-realizable primitive, which reduces the task of checking two instances in some relation to the task of checking a single instance. We construct a folding scheme for a characterization of $\mathsf{NP}$ and show that it implies an IVC scheme with improved efficiency characteristics: (1) the "recursion overhead" (i.e., the number of steps that the prover proves in addition to proving the execution of $F$) is a constant and it is dominated by two group scalar multiplications expressed as a circuit (this is the smallest recursion overhead in the literature), and (2) the prover's work at each step is dominated by two multiexponentiations of size $O(|F|)$, providing the fastest prover in the literature. The size of a proof is $O(|F|)$ group elements, but we show that using a variant of an existing zkSNARK, the prover can prove the knowledge of a valid proof succinctly and in zero-knowledge with $O(\log{|F|})$ group elements. Finally, our approach neither requires a trusted setup nor FFTs, so it can be instantiated efficiently with any cycles of elliptic curves where DLOG is hard.

HyperNova: Recursive arguments for customizable constraint systems

We introduce HyperNova, a new recursive argument for proving incremental computations whose steps are expressed with CCS (Setty et al. ePrint 2023/552), a customizable constraint system that simultaneously generalizes Plonkish, R1CS, and AIR without overheads. HyperNova makes four contributions, each resolving a major problem in the area of recursive arguments.
First, it provides a folding scheme for CCS where the prover’s cryptographic cost is a single multi-scalar multiplication (MSM) of size equal to the number of variables in the constraint system, which is optimal when using an MSM-based commitment scheme. The folding scheme can fold multiple instances at once, making it easier to build generalizations of IVC such as PCD. Second, when proving program executions on stateful machines (e.g., EVM, RISC-V), the cost of proving a step of a program is proportional only to the size of the circuit representing the instruction invoked by the program step ("a la carte" cost profile). Third, we show how to achieve zero-knowledge for "free" and without the need to employ zero-knowledge SNARKs: we use a folding scheme to "randomize" IVC proofs. This highlights a new application of folding schemes. Fourth, we show how to efficiently instantiate HyperNova over a cycle of elliptic curves. For this, we provide a general technique, which we refer to as CycleFold, that applies to all modern folding-scheme-based recursive arguments.

Abuse-Resistant Location Tracking: Balancing Privacy and Safety in the Offline Finding Ecosystem

Location tracking accessories (or "tracking tags") such as those sold by Apple, Samsung, and Tile, allow owners to track the location of their property via offline finding networks. The tracking protocols were designed to ensure that no entity (including the vendor) can use a tag's broadcasts to surveil its owner. These privacy guarantees, however, seem to be at odds with the phenomenon of $\textit{tracker-based stalking}$, where attackers use these very tags to monitor a target's movements. Numerous such criminal incidents have been reported, and in response, manufacturers have chosen to substantially weaken privacy guarantees in order to allow users to detect stalker tags. This compromise has been adopted in a recent IETF draft jointly proposed by Apple and Google.
We put forth the notion of $\textit{abuse-resistant offline finding protocols}$ that aim to achieve a better balance between user privacy and stalker detection. We present an efficient protocol that achieves stalker detection under realistic conditions without sacrificing honest user privacy. At the heart of our result, and of independent interest, is a new notion of $\textit{multi-dealer secret sharing}$ which strengthens standard secret sharing with novel privacy and correctness guarantees. We show that this primitive can be instantiated efficiently on edge devices using variants of Interleaved Reed-Solomon codes combined with new lattice-based decoding algorithms.

Shared-Custodial Password-Authenticated Deterministic Wallets

Cryptographic wallets are an essential tool in Blockchain networks to ensure the secure storage and maintenance of an user's cryptographic keys. Broadly, wallets can be divided into three categories, namely custodial, non-custodial, and shared-custodial wallets. The first two are centralized solutions, i.e., the wallet is operated by a single entity, which inherently introduces a single point of failure. Shared-custodial wallets, on the other hand, are maintained by two independent parties, e.g., the wallet user and a service provider, and hence avoid the single point of failure centralized solutions. Unfortunately, current shared-custodial wallets suffer from significant privacy issues.
In our work, we introduce password-authenticated deterministic wallets (PADW), a novel and efficient shared-custodial wallet solution, which exhibits strong security and privacy guarantees. In a nutshell, in a PADW scheme, the secret key of the user is shared between the user and the server. In order to generate a signature, the user first authenticates itself to the server by providing a password and afterwards engages in an interactive signing protocol with the server. Security is guaranteed as long as at most one of the two parties is corrupted. Privacy, on the other hand, guarantees that a corrupted server cannot link a transaction to a particular user. We formally model the notion of PADW schemes and we give an instantiation from blind Schnorr signatures. Our construction allows for deterministic key derivation, a feature that is widely used in practice by existing wallet schemes, and it does not rely on any heavy cryptographic primitives. We prove our scheme secure against adaptive adversaries in the random oracle model and under standard assumptions. That is, our security proof only relies on the assumption that the Schnorr signature scheme is unforgeable and that a public key encryption scheme is CCA-secure.

PathGES: An Efficient and Secure Graph Encryption Scheme for Shortest Path Queries

The increasing importance of graph databases and cloud storage services prompts the study of private queries on graphs. We propose PathGES, a graph encryption scheme (GES) for single-pair shortest path queries. PathGES is efficient and mitigates the state-of-the-art attack by Falzon and Paterson (2022) on the GES by Ghosh, Kamara, and Tamassia (2021), while only incurring an additional logarithmic factor in storage overhead. PathGES leverages a novel data structure that minimizes leakage and server computation.
We generalize what it means for one leakage function to leak less than another by defining a relation with respect to a family of query sequences and show that our scheme provably leaks less than the GKT scheme when all queries have been issued. We complement our security proof with a cryptanalysis that demonstrates an information-theoretic gap in the size of the query reconstruction space of our scheme as compared to the GKT scheme and provide concrete examples of the gap for several graph families. Our prototype implementation of PathGES is efficient in practice for real-world social network and geographic data sets. In comparison with the GKT scheme, PathGES has on average the same response size and up to 1.5$\times$ faster round-trip query time.

On the Relationship between FuncCPA and FuncCPA+

Akavia, Gentry, Halevi, and Vald introduced the security notion of function-chosen-plaintext-attack (FuncCPA security) for public-key encryption schemes.
FuncCPA is defined by adding a functional re-encryption oracle to the IND-CPA game.
This notion is crucial for secure computation applications where the server is allowed to delegate a part of the computation to the client.
Dodis, Halevi, and Wichs introduced a stronger variant called FuncCPA$^+$.
They showed FuncCPA$^+$ implies FuncCPA and conjectured that FuncCPA$^+$ is strictly stronger than FuncCPA.
They left an open problem to clarify the relationship between these variants.
Contrary to their conjecture, we show that FuncCPA is equivalent to FuncCPA$^+$.
We show it by two proofs with a trade-off between the number of queries and the number of function inputs.
Furthermore, we show these parameters determine the security levels of FuncCPA and FuncCPA$^+$.

Respire: High-Rate PIR for Databases with Small Records

Private information retrieval (PIR) is a key building block in many privacy-preserving systems, and recent works have made significant progress on reducing the concrete computational costs of single-server PIR. However, existing constructions have high communication overhead, especially for databases with small records. In this work, we introduce Respire, a lattice-based PIR scheme tailored for databases of small records. To retrieve a single record from a database with over a million 256-byte records, the Respire protocol requires just 6.1 KB of online communication; this is a 5.9x reduction compared to the best previous lattice-based scheme. Moreover, Respire naturally extends to support batch queries. Compared to previous communication-efficient batch PIR schemes, Respire achieves a 3.4-7.1x reduction in total communication while maintaining comparable throughput (200-400 MB/s). The design of Respire relies on new query compression and response packing techniques based on ring switching in homomorphic encryption.

Zero-Knowledge Proofs of Training for Deep Neural Networks

A zero-knowledge proof of training (zkPoT) enables a party to prove that they have correctly trained a committed model based on a committed dataset without revealing any additional information about the model or the dataset. An ideal zkPoT should offer provable security and privacy guarantees, succinct proof size and verifier runtime, and practical prover efficiency. In this work, we present \name, a zkPoT targeted for deep neural networks (DNNs) that achieves all these goals at once. Our construction enables a prover to iteratively train their model via (mini-batch) gradient descent, where the number of iterations need not be fixed in advance; at the end of each iteration, the prover generates a commitment to the trained model parameters attached with a succinct zkPoT, attesting to the correctness of the executed iterations. The proof size and verifier time are independent of the number of iterations.
Our construction relies on two building blocks. First, we propose an optimized GKR-style (sumcheck-based) proof system for the gradient-descent algorithm with concretely efficient prover cost; this allows the prover to generate a proof for each iteration. We then show how to recursively compose these proofs across multiple iterations to attain succinctness. As of independent interest, we propose a generic framework for efficient recursive composition of GKR-style proofs, along with aggregatable polynomial commitments.
Benchmarks indicate that \name\ can handle the training of complex models such as VGG-11 with 10~million parameters and batch size~$16$. The prover runtime is $15$~minutes per iteration, which is $\mathbf{24 \times}$ faster than generic recursive proofs, with prover memory overhead $\mathbf{27\times}$ lower. The proof size is $1.63$~megabytes, and the verifier runtime is only $130$~milliseconds, where both are independent of the number of iterations and the size of the dataset.

A Crack in the Firmament: Restoring Soundness of the Orion Proof System and More

Orion (Xie et al. CRYPTO'22) is a recent plausibly post-quantum zero-knowledge argument system with a linear time prover. It improves over Brakedown (Golovnev et al. ePrint'21 and CRYPTO'23) by reducing proof size and verifier complexity to be polylogarithmic and additionally adds the zero-knowledge property. The argument system is demonstrated to be concretely efficient with a prover time being the fastest among all existing succinct proof systems and a proof size that is an order of magnitude smaller than Brakedown. Since its publication in CRYPTO 2022, two revisions have been made to the zk-SNARK. First, there was an issue with how zero-knowledge was handled. Second, Orion was discovered to be unsound, which was then repaired through the use of a commit-and-prove SNARK as an ``outer'' SNARK.
As we will show in this paper, unfortunately, Orion in its current revision is still unsound (with and without the zero-knowledge property) and we will demonstrate practical attacks on it. We then show how to repair Orion without additional assumptions, which requies non-trivial fixes when aiming to preserve the linear time prover complexity. The proposed fixes lead to an even improved efficiency, i.e., smaller proof size and verifier time, over the claimed efficiency of the initial version of Orion. Moreover, we provide the first rigorous security proofs and explicitly consider multi-point openings and non-interactivity. While revisiting Orion we make some additional contributions which might be of independent interest, most notable an improved code randomization technique that retains the minimum relative distance.

Efficient Threshold FHE for Privacy-Preserving Applications

Threshold Fully Homomorphic Encryption (ThFHE) enables arbitrary computation over encrypted data while keeping the decryption key distributed across multiple parties at all times. ThFHE is a key enabler for threshold cryptography and, more generally, secure distributed computing. Existing ThFHE schemes relying on standard hardness assumptions, inherently require highly inefficient parameters and are unsuitable for practical deployment. In this paper, we take a novel approach towards making ThFHE practically usable by (i) proposing an efficient ThFHE scheme with a new analysis resulting in significantly improved parameters; (ii) and providing the first practical ThFHE implementation benchmark based on Torus FHE.
• We propose the first practical ThFHE scheme with a polynomial modulus-to-noise ratio that supports practically efficient parameters while retaining provable security based on standard quantum-safe assumptions. We achieve this via R ́enyi divergence-based security analysis of our proposed threshold decryption mechanism.
• We present a prototype software implementation of our proposed ThFHE scheme that builds upon the existing Torus-FHE library and supports (distributed) decryption on highly resource-constrained ARM-based handheld devices. Along the way, we implement several extensions to the Torus FHE library, including a Torus-based linear integer secret sharing subroutine to support ThFHE key sharing and distributed decryption for any threshold access structure.
We illustrate the efficacy of our proposal via an end-to-end use case involving encrypted computations over a real medical database and distributed decryptions of the computed result on resource-constrained ARM-based handheld devices.

On the Number of Restricted Solutions to Constrained Systems and their Applications

In this paper, we formulate a special class of systems of linear equations over finite fields and derive lower bounds on the number of solutions adhering to some predefined restrictions. We then demonstrate the applications of these lower bounds to derive tight PRF security (up to $2^{3n/4}$ queries) for single-keyed variants of the Double-block Hash-then-Sum (DBHtS) paradigm, specifically PMAC+ and LightMAC+. Additionally, we show that the sum of $r$ independent copies of the Even-Mansour cipher is a secure PRF up to $2^{\frac{r}{r+1}n}$ queries.

Quantum One-Wayness of the Single-Round Sponge with Invertible Permutations

Sponge hashing is a widely used class of cryptographic hash algorithms which underlies the current international hash function standard SHA-3. In a nutshell, a sponge function takes as input a bit-stream of any length and processes it via a simple iterative procedure: it repeatedly feeds each block of the input into a so-called block function, and then produces a digest by once again iterating the block function on the final output bits. While much is known about the post-quantum security of the sponge construction when the block function is modeled as a random function or one-way permutation, the case of invertible permutations, which more accurately models the construction underlying SHA-3, has so far remained a fundamental open problem.
In this work, we make new progress towards overcoming this barrier and show several results. First, we prove the ``double-sided zero-search'' conjecture proposed by Unruh (eprint' 2021) and show that finding zero-pairs in a random $2n$-bit permutation requires at least $\Omega(2^{n/2})$ many queries---and this is tight due to Grover's algorithm. At the core of our proof lies a novel ``symmetrization argument'' which uses insights from the theory of Young subgroups. Second, we consider more general variants of the double-sided search problem and show similar query lower bounds for them. As an application, we prove the quantum one-wayness of the single-round sponge with invertible permutations in the quantum random oracle model.

Perceived Information Revisited II: Information-Theoretical Analysis of Deep-Learning Based Side-Channel Attacks

Previous studies on deep-learning-based side-channel attacks (DL-SCAs) have shown that traditional performance evaluation metrics commonly used in DL, like accuracy and F1 score, are not effective in evaluating DL-SCA performance. Therefore, some previous studies have proposed new alternative metrics for evaluating the performance of DL-SCAs. Notably, perceived information (PI) and effective perceived information (EPI) are major metrics based on information theory. While it has been experimentally confirmed that these metrics can give the attack success rate (SR) for DL-SCAs, their theoretical validity remains unclear.
In this paper, we propose a new theoretically valid performance evaluation metric called latent perceived information (LPI), which serves as an alternative to the existing metrics. LPI is defined as the mutual information between the output of the feature extractor of a neural network (NN) model and the intermediate value, representing the potential attack performance of the trained model. First, we prove that LPI provides an upper bound on the SR of a DL-SCA by modeling and formulating DL-SCA as a communication channel. Additionally, we clarify the conditions under which PI and EPI theoretically provide an upper bound on the SR from the perspective of LPI. For practical computation of LPI, we present two methods. One utilizes the Kraskov (KSG) estimator, a common mutual information estimator, and the other is based on the logistic regression. While the KSG estimator is computationally intensive, it yields accurate LPI values. In contrast, the logistic regression is faster but provides a lower bound for LPI. Through experimental attacks on AES software and hardware implementations with masking countermeasures, we demonstrate that the LPI values estimated by these two methods are significantly similar, indicating the reliability and soundness of our proposed estimation techniques. Furthermore, we present the use of a classifier based on logistic regression to improve the attack performance of the trained model. We experimentally demonstrate that an NN model with the logistic regression-based classifier can achieve the upper bound of attack performance predicted by LPI, meaning a significant improvement in attack performance from the original NN. Thus, our study contributes to realizing the optimal distinguisher using the trained model in terms of attack performance.

Practical Traceable Receipt-Free Encryption

Traceable Receipt-free Encryption (TREnc) is a verifiable public-key encryption primitive introduced at Asiacrypt 2022. A TREnc allows randomizing ciphertexts in transit in order to remove any subliminal information up to a public trace that ensures the non-malleability of the underlying plaintext. A remarkable property of TREnc is the indistinguishability of the randomization of chosen ciphertexts against traceable chosen-ciphertext attacks (TCCA). This property can support applications like voting, and it was shown that receipt-free non-interactive voting, where voters are unable to convince any third party of the content of their vote, can be generically built from a TREnc.
While being a very promising primitive, the few existing TREnc mechanisms either require a secret coin CRS or are fairly demanding in time and space requirements. Their security proofs also come with a linear security degradation in the number of challenge ciphertexts.
We address these limitations and offer two efficient public coin TREnc mechanisms tailored for the two common tallying approaches in elections: homomorphic and mixnet-based. The TCCA security of our mechanisms also enjoys an almost-tight reduction to SXDH, based on a new randomizable technique of independent interest in the random oracle model.
A Rust implementation of our TREnc mechanisms demonstrates that we can verifiably encrypt 64 bits in less than a second, and full group elements in around 30 ms., which is sufficient for most real-world applications. While comparing with other solutions, we show that our approaches lead to the most efficient non-interactive receipt-free voting system to date.

Application of Mordell-Weil lattices with large kissing numbers to acceleration of multi-scalar multiplication on elliptic curves

This article aims to speed up (the precomputation stage of) multi-scalar multiplication (MSM) on ordinary elliptic curves of $j$-invariant $0$ with respect to specific ''independent'' (a.k.a. ''basis'') points. For this purpose, so-called Mordell--Weil lattices (up to rank $8$) with large kissing numbers (up to $240$) are employed. In a nutshell, the new approach consists in obtaining more efficiently a considerable number (up to $240$) of certain elementary linear combinations of the ``independent'' points. By scaling the point (re)generation process, it is thus possible to get a significant performance gain. As usual, the resulting curve points can be then regularly used in the main stage of an MSM algorithm to avoid repeating computations. Seemingly, this is the first usage of lattices with large kissing numbers in cryptography, while such lattices have already found numerous applications in other mathematical domains. Without exaggeration, the article results can strongly affect performance of today's real-world elliptic curve cryptography, since MSM is a widespread primitive (often the unique bottleneck) in modern protocols. Moreover, the new (re)generation technique is prone to further improvements by considering Mordell--Weil lattices with even greater kissing numbers.

Polylogarithmic Proofs for Multilinears over Binary Towers

We present a succinct polynomial commitment scheme for multilinears over tiny binary fields, including that with just 2 elements. To achieve this, we develop two main ideas. Our first adapts Zeilberger, Chen and Fisch's BaseFold ('23) PCS to the binary setting; it uses FRI (ICALP '18)'s lesser-known binary variant, and reveals a new connection between that work and Lin, Chung and Han (FOCS '14)'s additive NTT. We moreover present a novel large-field-to-small-field compiler for polynomial commitment schemes. Using a technique we call "ring-switching", our compiler generically bootstraps any multilinear PCS over a large, power-of-two degree extension field into a further PCS over that field's ground field. The resulting small-field PCS lacks "embedding overhead", in that its commitment cost is identical to that of the large-field scheme on each input size (measured in bits). We attain concretely small proofs for enormous binary multilinears, shrinking the proofs of Diamond and Posen ('23) by an order of magnitude.

On the Concrete Security of Non-interactive FRI

FRI is a cryptographic protocol widely deployed today as a building
block of many efficient SNARKs that help secure transactions of hundreds of
millions of dollars per day. The Fiat-Shamir security of FRI—vital for understanding
the security of FRI-based SNARKs—has only recently been formalized and
established by Block et al. (ASIACRYPT ’23).
In this work, we complement the result of Block et al. by providing a thorough
concrete security analysis of non-interactive FRI under various parameter settings
from protocols deploying (or soon to be deploying) FRI today. We find that these
parameters nearly achieve their desired security targets (being at most 1-bit less
secure than their targets) for non-interactive FRI with respect to a certain security
conjecture about the FRI Protocol. However, in all but one set of parameters, we
find that the provable security of non-interactive FRI under these parameters is
severely lacking, being anywhere between 21- and 63-bits less secure than the
conjectured security. The conjectured security of FRI assumes that known attacks
are optimal, the security of these systems would be severely compromised should
a better attack be discovered. In light of this, we present parameter guidelines
for achieving 100-bits of provable security for non-interactive FRI along with a
methodology for tuning these parameters to suit the needs of protocol designers.

Faster Private Decision Tree Evaluation for Batched Input from Homomorphic Encryption

Privacy-preserving decision tree evaluation (PDTE) allows a client that holds feature vectors to perform inferences against a decision tree model on the server side without revealing feature vectors to the server. Our work focuses on the non-interactive batched setting where the client sends a batch of encrypted feature vectors and then obtains classifications, without any additional interaction. This is useful in privacy-preserving credit scoring, biometric authentication, and many more applications.
In this paper, we propose two novel non-interactive batched PDTE protocols, OursRCC and OursCW, based on two batched ciphertext-plaintext comparison algorithms, our batched range cover comparison (RCC) comparator and the constant-weight (CW) piece-wise comparator, respectively. When comparing 16-bit batched encrypted values to a single plaintext value, our comparison algorithms show a speedup of up to $72\times$ compared to the state-of-the-art Level Up (CCS'23). Moreover, we introduced a new tree traversal method called adapted SumPath, to achieve $\mathcal{O}(1)$ complexity of the server's response, whereas Level Up has $\mathcal{O}(2^d)$ complexity for a depth-$d$ tree and the client needs to look up classification values in a table.
Overall, our PDTE protocols attain the optimal server-to-client communication complexity and are up to $17\times$ faster than Level Up in batch size 16384.

Post-Quantum Access Control with Application to Secure Data Retrieval

Servan-Schreiber et al. (S&P 2023) presented a new notion called private access control lists (PACL) for function secret sharing (FSS), where the FSS evaluators can ensure that the FSS dealer is authorized to share the given function. Their construction relies on costly non-interactive secret-shared proofs and is not secure in post-quantum setting. We give a construction of PACL from publicly verifiable secret sharing (PVSS) under short integer solution (SIS). Our construction adapts the Gentry et al’s scheme (Eurocrypt 2022) for post-quantum setting based on learning with error assumption (LWE). The implementation of our PACL with different files showed that it is feasible even at different sizes, and should remain so even with large secret vectors. This construction has many applications for access control by applying FSS. We show how to apply the proposed PACL construction to secure data retrieval. We also present a scheme for secure data retrieval with access control, which might be of independent interest.

LaPSuS – A Lattice-Based Private Stream Aggregation Scheme under Scrutiny

Private Stream Aggregation (PSA) allows clients to send encryptions of their private values to an aggregator that is then able to learn the sum of these values but nothing else. It has since found many applications in practice, e.g. for smart metering or federated learning. In 2018, Becker et al. proposed the first lattice-based PSA scheme LaPS (NDSS 2018), with putative post-quantum security, which has subsequently been patented. In this paper, we describe two attacks on LaPS that break the claimed aggregator obliviousness security notion, where the second attack even allows to recover the secret keys of the clients, given enough encryptions. Moreover, we review the PSA literature for other occurrences of the responsible flawed proof steps. By explicitly tracking down and discussing these flaws, we clarify and hope to contribute to the literature on PSA schemes, in order to prevent further insecure schemes in practice. Finally, we point out that a Real-or-Random variant of the security notion that is often used as a substitute to make proofs easier, is not well-defined and potentially weaker than the standard PSA security notion. We propose a well defined variant and show that it is equivalent to the standard security notion of PSA under mild assumptions.

Fast Parallelizable Misuse-Resistant Authenticated Encryption: Low Latency (Decryption-Fast) SIV

MRAE security is an important goal for many AEAD applications where the nonce uniqueness cannot be maintained and security risks are significant. However, MRAE schemes can be quite expensive. Two of the SoTA MRAE-secure schemes; Deoxys-II and AES-GCM-SIV rely on internal parallelism and special instructions to achieve competitive performance. However, they both suffer from the same bottleneck, they have at least one call to the underlying primitive that cannot be parallelized to any other call. Romulus-M and LMDAE are two other more recent MRAE secure schemes based on TBCs that target low area hardware. However, they are unparallelizable so they are slower than their counterparts.
In this paper, we present two new AEAD modes and four instantiations based on Tweakable Block Ciphers. These new modes target equipping high-speed applications on parallel platforms with nonce misuse resistant AEAD (MRAE). The first mode, LLSIV, targets similar performance on single-core platforms to SCT-2, while eliminating the bottlenecks that make SCT-2 not fully parallelizable. The enhanced parallelism allows LLSIV to encrypt significantly more blocks on parallel platforms, compared to SCT-2, in the same amount of time. LLSIV is based on the NaT MAC, where each ciphertext block can itself be viewed as an instance of NaT when the plaintext is prepended with
. The trade-off is that LLSIV requires the inverse function of the TBC. However, the inverse function is used only once per message and we demonstrate that for parallel implementations it represents a very small overhead.
We give an instantiation of LLSIV based on the SKINNY-128-384 TBC, and a pruned scheme, dubbed pLLSIV, which targets enhanced performance compared both SCT-2 and LLSIV on all platforms, while having reduced security claims. It relies on the recently popularized prove-then-prune methodology to take full advantage of the properties of LLSIV. This leads to a significant performance improvement, making pLLSIV even faster than online TBC-based schemes that are not MRAE-secure. Last but not least, we give an instantiation that uses the primitives used in AES-GCM-SIV: the PolyVal hash function and AES. Our instantiation is faster than AES-GCM-SIV on all platforms and have better bounds. On the other hand, it relies on the ideal cipher model as it uses the ICE TBC proposed as part of the Remus AEAD design.
The second mode we describe is LLDFV. It uses ideas from LLSIV combined the Decryption-Fast SIV (DFV) framework proposed recently by Minematsu. The goal is to reduce the number of calls to the TBC by one, while making the scheme as parallelizable as LLSIV. This makes the scheme faster that DFV on all platforms.

Scalable Agreement Protocols with Optimal Optimistic Efficiency

Designing efficient distributed protocols for various agreement tasks such as Byzantine Agreement, Broadcast, and Committee Election is a fundamental problem. We are interested in $scalable$ protocols for these tasks, where each (honest) party communicates a number of bits which is sublinear in $n$, the number of parties. The first major step towards this goal is due to King et al. (SODA 2006) who showed a protocol where each party sends only $\tilde O(1)$ bits throughout $\tilde O(1)$ rounds, but guarantees only that $1-o(1)$ fraction of honest parties end up agreeing on a consistent output, assuming constant $<1/3$ fraction of static corruptions. Few years later, King et al. (ICDCN 2011) managed to get a full agreement protocol in the same model but where each party sends $\tilde O(\sqrt{n})$ bits throughout $\tilde O(1)$ rounds. Getting a full agreement protocol with $o(\sqrt{n})$ communication per party has been a major challenge ever since.
In light of this barrier, we propose a new framework for designing efficient agreement protocols. Specifically, we design $\tilde O(1)$-round protocols for all of the above tasks (assuming constant $<1/3$ fraction of static corruptions) with optimistic and pessimistic guarantees:
$\bullet$ $Optimistic$ $complexity$: In an honest execution, (honest) parties send only $\tilde O(1)$ bits.
$\bullet$ $Pessimistic$ $complexity$: In any other case, (honest) parties send $\tilde O(\sqrt{n})$ bits.
Thus, all an adversary can gain from deviating from the honest execution is that honest parties will need to work harder (i.e., transmit more bits) to reach agreement and terminate. Besides the above agreement tasks, we also use our new framework to get a scalable secure multiparty computation (MPC) protocol with optimistic and pessimistic complexities.
Technically, we identify a relaxation of Byzantine Agreement (of independent interest) that allows us to fall-back to a pessimistic execution in a coordinated way by all parties. We implement this relaxation with $\tilde O(1)$ communication bits per party and within $\tilde O(1)$ rounds.

An analysis of the Crossbred Algorithm for the MQ Problem

The Crossbred algorithm is currently the state-of-the-art method for solving overdetermined multivariate polynomial systems over $\mathbb{F}_2$. Since its publication in 2015, several record breaking implementations have been proposed and demonstrate the power of this hybrid approach. Despite these practical results, the complexity of this algorithm and the choice of optimal parameters for it are difficult open questions. In this paper, we prove a bivariate generating series for potentially admissible parameters of the Crossbred algorithm.

CoGNN: Towards Secure and Efficient Collaborative Graph Learning

Collaborative graph learning represents a learning paradigm where multiple parties jointly train a graph neural network (GNN) using their own proprietary graph data. To honor the data privacy of all parties, existing solutions for collaborative graph learning are either based on federated learning (FL) or secure machine learning (SML). Although promising in terms of efficiency and scalability due to their distributed training scheme, FL-based approaches fall short in providing provable security guarantees and achieving good model performance. Conversely, SML-based solutions, while offering provable privacy guarantees, are hindered by their high computational and communication overhead, as well as poor scalability as more parties participate.
To address the above problem, we propose CoGNN, a novel framework that simultaneously reaps the benefits of both FL-based and SML-based approaches. At a high level, CoGNN is enabled by (i) a novel message passing mechanism that can obliviously and efficiently express the vertex data propagation/aggregation required in GNN training and inference and (ii) a two-stage Dispatch-Collect execution scheme to securely decompose and distribute the GNN computation workload for concurrent and scalable executions. We further instantiate the CoGNN framework, together with customized optimizations, to train Graph Convolutional Network (GCN) models. Extensive evaluations on three graph datasets demonstrate that compared with the state-of-the-art (SOTA) SML-based approach, CoGNN reduces up to $123$x running time and up to $522$x communication cost per party. Meanwhile, the GCN models trained using CoGNN have nearly identical accuracies as the plaintext global-graph training, yielding up to $11.06\%$ accuracy improvement over the GCN models trained via federated learning.

A Note on `` Provably Secure and Lightweight Authentication Key Agreement Scheme for Smart Meters''

We show that the authentication key agreement scheme
[IEEE Trans. Smart Grid, 2023, 14(5), 3816-3827] is flawed due to its inconsistent computations. We also show that the scheme fails to keep anonymity, not as claimed.

QuietOT: Lightweight Oblivious Transfer with a Public-Key Setup

Oblivious Transfer (OT) is at the heart of secure computation and is a foundation for many applications in cryptography. Over two decades of work have led to extremely efficient protocols for evaluating OT instances in the preprocessing model, through a paradigm called OT extension.
A few OT instances generated in an offline phase can be used to perform many OTs in an online phase efficiently, i.e., with very low communication and computational overheads.
Specifically, traditional OT extension protocols use a small number of “base” OTs, generated using any black-box OT protocol, and convert them into many OT instances using only lightweight symmetric-key primitives.
Recently, a new paradigm of OT with a *public-key setup* has emerged, which replaces the base OTs with a non-interactive setup: Using only the public key of the other party, two parties can efficiently compute a virtually unbounded number of OT instances on-the-fly.
In this paper, we put forth a novel framework for OT extension with a public-key setup and concretely efficient instantiations. An implementation of our framework is 30-100 times faster when compared to the previous state-of-the-art public-key OT protocols, and remains competitive even when compared to OT protocols that *do not* offer a public-key setup. Additionally, our instantiations result in the first public-key schemes with plausible post-quantum security.
In summary, this paper contributes:
- QuietOT: A framework for OT extension with a public-key setup that uses fast, symmetric-key primitives to generate OT instances following a one-time public-key setup, and offering additional features such as precomputability.
- A public-key setup for QuietOT from the RingLWE assumption, resulting in the first post-quantum construction of OT extension with a public-key setup.
- An optimized, open-source implementation of our construction that can generate up to 1M OT extensions per second on commodity hardware. In contrast, the state-of-the-art public-key OT protocol is limited to approximately 20K OTs per second.
- The first formal treatment of the security of OT with a public-key setup in a multi-party setting, which addresses several subtleties that were overlooked in prior work.

Shift-invariant functions and almost liftings

We investigate shift-invariant vectorial Boolean functions on $n$ bits that are lifted from Boolean functions on $k$ bits, for $k\leq n$. We consider vectorial functions that are not necessarily permutations, but are, in some sense, almost bijective. In this context, we define an almost lifting as a Boolean function for which there is an upper bound on the number of collisions of its lifted functions that does not depend on $n$. We show that if a Boolean function with diameter $k$ is an almost lifting, then the maximum number of collisions of its lifted functions is $2^{k-1}$ for any $n$. Moreover, we search for functions in the class of almost liftings that have good cryptographic properties and for which the non-bijectivity does not cause major security weaknesses. These functions generalize the well-known map $\chi$ used in the Keccak hash function.

On affine forestry over integral domains and families of deep Jordan-Gauss graphs

Let K be a commutative ring. We refer to a connected bipartite graph G = G_n(K) with partition sets P = K^n (points) and L = K^n (lines) as an affine graph over K of dimension dim(G) = n if the neighbourhood of each vertex is isomorphic to K. We refer to G as an algebraic affine graph over K if the incidence between a point (x_1, x_2, . . . , x_n)
and line [y_1, y_2, . . . , y_n] is defined via a system of polynomial equations of the kind f_i = 0 where f_i ∈ K[x_1, x_2, . . . , x_n, y_1, y_2, . . . , y_n]. We say that an affine algebraic graph is a Jordan-Gauss graph over K if the incidences between points and lines are given by a
quadratic system of polynomial equations, and the neighbourhood of each vertex is given as a solution set of the system of linear equations in row-echelon form.
For each integral domain K we consider the known explicit construction of the family of Jordan-Gauss graphs A(n, K), n = 2, 3, . . . with cycle indicator ≥ 2n + 2. Additionally several constructions of families of edge intransitive Jordan-Gauss graphs over K of
increasing girth with well defined projective limit will be presented. This projective limit is a forest defined by the system of algebraic equations. In the case K = F_q, q ≥ 3 we present results of computer experiments for the evaluation of girth, cycle indicator, diameter and the second largest eigenvalue of the constructed graphs, and we formulate
several conjectures on their properties. One of the conjectures is that the girth of A(n, F_q) is 2[(n+ 5)/2]. We discuss briefly some applications of Jordan-Gauss graphs of large girth to Graph Theory, Algebraic Geometry and the theory of LDPC codes; and we consider ideas to use groups related to these graphs in Noncommutative Cryptography and Stream Ciphers Design.

A Central Limit Framework for Ring-LWE Noise Analysis

Uncategorized

Uncategorized

This paper develops Central Limit arguments for analysing the noise in ciphertexts in two homomorphic encryption schemes that are based on Ring-LWE. The first main contribution of this paper is to present and evaluate an average-case noise analysis for the BGV scheme. Our approach relies on the recent work of Costache et al. (SAC 2023) that gives the approximation of a polynomial product as a multivariate Normal distribution. We show how this result can be applied in the BGV context and evaluate its efficacy. We find this average-case approach can much more closely model the noise growth in BGV implementations than prior approaches, but in some cases it can also underestimate the practical noise growth. Our second main contribution is to develop a Central Limit framework to analyse the noise growth in the homomorphic Ring-LWE cryptosystem of Lyubashevsky, Peikert and Regev (Eurocrypt 2013, full version). Our approach is very general: apart from finite variance, no assumption on the distribution of the noise is required (in particular, the noise need not be subgaussian). We show that our approach leads to tighter bounds for the probability of decryption failure than those of prior work.

Simple Threshold (Fully Homomorphic) Encryption From LWE With Polynomial Modulus

The learning with errors (LWE) assumption is a powerful tool for building encryption schemes with useful properties, such as plausible resistance to quantum computers, or support for homomorphic computations. Despite this, essentially the only method of achieving threshold decryption in schemes based on LWE requires a modulus that is superpolynomial in the security parameter, leading to a large overhead in ciphertext sizes and computation time.
In this work, we propose a (fully homomorphic) encryption scheme that supports a simple $t$-out-of-$n$ threshold decryption protocol while allowing for a polynomial modulus. The main idea is to use the Rényi divergence (as opposed to the statistical distance as in previous works) as a measure of distribution closeness. This comes with some technical obstacles, due to the difficulty of using the Rényi divergence in decisional security notions such as standard semantic security. We overcome this by constructing a threshold scheme with a weaker notion of one-way security and then showing how to transform any one-way (fully homomorphic) threshold scheme into one guaranteeing (selective) indistinguishability-based security.

On the Semidirect Discrete Logarithm Problem in Finite Groups

We present an efficient quantum algorithm for solving the semidirect discrete logarithm problem (SDLP) in any finite group. The believed hardness of the semidirect discrete logarithm problem underlies more than a decade of works constructing candidate post-quantum cryptographic algorithms from nonabelian groups. We use a series of reduction results to show that it suffices to consider SDLP in finite simple groups. We then apply the celebrated Classification of Finite Simple Groups to consider each family. The infinite families of finite simple groups admit, in a fairly general setting, linear algebraic attacks providing a reduction to the classical discrete logarithm problem. For the sporadic simple groups, we show that their inherent properties render them unsuitable for cryptographically hard SDLP instances, which we illustrate via a Baby-Step Giant-Step style attack against SDLP in the Monster Group.
Our quantum SDLP algorithm is fully constructive for all but three remaining cases that appear to be gaps in the literature on constructive recognition of groups; for these cases SDLP is no harder than finding a linear representation. We conclude that SDLP is not a suitable post-quantum hardness assumption for any choice of finite group.

Cross Ledger Transaction Consistency for Financial Auditing

Auditing throughout a fiscal year is integral to organizations with transactional activity. Organizations transact with each other and record the details for all their economical activities so that a regulatory committee can verify the lawfulness and legitimacy of their activity. However, it is computationally infeasible for the committee to perform all necessary checks for each organization. To overcome this, auditors assist in this process: organizations give access to all their internal data to their auditors, who then produce reports regarding the consistency of the organization's data, alerting the committee to any inconsistencies. Despite this, numerous issues that result in fines annually revolve around such inconsistencies in bookkeeping across organizations. Notably, committees wishing to verify the correctness of auditor-provided reports need to redo all their calculations; a process which is computationally proportional to the number of organizations. In fact, it becomes prohibitive when considering real-world settings with thousands of organizations. In this work, we propose two protocols, CLOSC and CLOLC, whose goals are to enable auditors and a committee to verify the consistency of transactions across different ledgers. Both protocols ensure that for every transaction recorded in an organization's ledger, there exists a dual one in the ledger of another organization while safeguarding against other potential attacks. Importantly, we minimize the information leakage to auditors and other organizations and guarantee three crucial security and privacy properties that we propose: (i) transaction amount privacy, (ii) organization-auditor unlinkability, and (iii) transacting organizations unlinkability. At the core of our protocols lies a two-tier ledger architecture alongside a suite of cryptographic tools. To demonstrate the practicality and scalability of our designs, we provide extensive performance evaluation for both CLOSC and CLOLC. Our numbers are promising, i.e., all computation and verification times lie in the range of seconds, even for millions of transactions, while the on-chain storage costs for an auditing epoch are encouraging i.e. in the range of GB for millions of transactions and thousands of organizations.

Anonymous Broadcast Authentication with Logarithmic-Order Ciphertexts from DLP or LWE

We propose an anonymous broadcast authentication (ABA) scheme to simultaneously control massive numbers of devices in practical resources.
As a theoretical foundation, we find a barrier in constructing an ABA scheme that can control numerous devices: a trilemma between (i) security, (ii) ciphertext length, and (iii) freedom of target device selection.
Therefore, we propose ABAs with ciphertext sizes of $O(\log N)$, where $N$ is the number of target devices and impose a certain restriction on (iii).
We provide an ABA template and instantiate it into specific schemes from the decisional Diffie-Hellman problem or the learning with errors problem.
Further, we provide example parameters and resource consumption of space and time.

Blockchain Space Tokenization

Handling congestion in blockchain systems is a fundamental problem given that the security and decentralization objectives of such systems lead to designs that compromise on (horizontal) scalability (what sometimes is referred to as the ``blockchain trilemma''). Motivated by this, we focus on the question whether it is possible to design a transaction inclusion policy for block producers that facilitates fee and delay predictability while being incentive compatible at the same time.
Reconciling these three properties is seemingly paradoxical given that the dominant approach to transaction processing is based on first-price auctions (e.g., as in Bitcoin) or dynamic adjustment of the minimum admissible fee (e.g. as in Ethereum EIP-1559) something that breaks fee predictability. At the same time, in fixed fee mechanisms (e.g., as in Cardano), fees are trivially predictable but are subject to relatively inexpensive bribing or denial of service attacks where transactions may be delayed indefinitely by a well funded attacker, hence breaking delay predictability.
In this work, we set out to address this problem by putting forward blockchain space tokenization (BST), namely a new capability of a blockchain system to tokenize its capacity for transactions and allocate it to interested users who are willing to pay ahead of time for the ability to post transactions regularly for a period of time. We analyze our system in the face of worst-case transaction-processing attacks by introducing a security game played between the mempool mechanism and an adversary. Leveraging this framework, we prove that BST offers predictable and asymptotically optimal delays, predictable fees, and is incentive compatible, thus answering the question posed in the affirmative.

Exploiting the Central Reduction in Lattice-Based Cryptography

This paper questions the side-channel security of central reduction technique, which is widely adapted in efficient implementations of Lattice-Based Cryptography (LBC). We show that the central reduction leads to a vulnerability by creating a strong dependency between the power consumption and the sign of sensitive intermediate values. We exploit this dependency by introducing the novel absolute value prediction function, which can be employed in higher-order non-profiled multi-query Side-Channel Analysis (SCA) attacks. Our results reveal that – compared to classical reduction algorithms – employing the central reduction scheme leads to a two-orders-of-magnitude decrease in the number of required SCA measurements to exploit secrets of masked implementations. We particularly show that our approach is valid for the prime moduli employed by Kyber and Dilithium, the lattice-based post-quantum algorithms selected by NIST. We practically evaluate our introduced approach by performing second-order non-profiled attacks against an open-source masked implementation of Kyber on an ARM Cortex-M4 micro-processor. In our experiments, we revealed the full secret key of the aforementioned masked implementation with only 250 power traces without any forms of profiling or choosing the ciphertexts.

Elliptic Curve Cryptography for the masses: Simple and fast finite field arithmetic

Shaped prime moduli are often considered for use in elliptic curve and isogeny-based cryptography to allow for faster modular reduction. Here we focus on the most common choices for shaped primes that have been suggested, that is pseudo-Mersenne, generalized Mersenne and Montgomery-friendly primes. We consider how best to to exploit these shapes for maximum efficiency, and provide an open source tool to automatically generate, test and time working high-level language finite-field code. Next we consider the advantage to be gained from implementations that are written in assembly language and which exploit special instructions, SIMD hardware if present, and the particularities of the algorithm being implemented.

Designated-Verifier zk-SNARKs Made Easy

Zero-knowledge succinct non-interactive argument of knowledge (zk-SNARK) is a kind of proof system that enables a prover to convince a verifier that an NP statement is true efficiently. In the last decade, various studies made a lot of progress in constructing more efficient and secure zk-SNARKs. Our research focuses on designated-verifier zk-SNARKs, where only the verifier knowing some secret verification state can be convinced by the proof. A natural idea of getting a designated-verifier zk-SNARK is encrypting a publicly-verifiable zk-SNARK's proof via public-key encryption. This is also the core idea behind the well-known transformation proposed by Bitansky et al. in TCC 2013 to obtain designated-verifier zk-SNARKs. However, the transformation only applies to zk-SNARKs which requires the complicated trusted setup phase and sticks on storage-expensive common reference strings. The loss of the secret verification state also makes the proof immediately lose the designated-verifier property.
To address these issues, we first define "strong designated-verifier" considering the case where the adversary has access to the secret verification state, then propose a construction of strong designated-verifier zk-SNARKs. The construction inspired by designated verifier signatures based on two-party ring signatures does not use encryption and can be applied on any public-verifiable zk-SNARKs to yield a designated-verifiable variant. We introduce our construction under the circuit satisfiability problem and implement it in Circom, then test it on different zk-SNARKs, showing the validity of our construction.

Universal Vector Commitments

We propose a new notion of vector commitment schemes with proofs of (non-)membership that we call universal vector commitments. We show how to build them directly from (i) Merkle commitments, and (ii) a universal accumulator and a plain vector commitment scheme. We also present a generic construction for universal accumulators over large domains from any vector commitment scheme, using cuckoo hashing. Leveraging the aforementioned generic constructions, we show that universal vector commitment schemes are implied by plain vector commitments and cuckoo hashing.

Identity-Based Matchmaking Encryption, Revisited: Improved Constructions with Strong Security

Identity-based matchmaking encryption (IB-ME) [Ateniese et al. Crypto 2019] allows users to communicate privately in an anonymous and authenticated manner. After the seminal paper by Ateniese et al., a lot of work has been done on the security and construction of IB-ME. In this work, we revisit the security definitions of IB-ME and provide improved constructions of it. First, we classify the existing security notions of IB-ME, systematically categorizing privacy into three categories (CPA, CCA, and privacy in the case of mismatch) and authenticity into four categories (NMA and CMA both against insiders and outsiders).In particular, we reconsider the privacy when the sender's identity is mismatched during decryption, and provide a new simple security game, called mismatch security, capturing the essence of it. Second, we propose efficient and strongly secure IB-ME schemes from the bilinear Diffie-Hellman assumption in the random oracle model and from anonymous identity-based encryption, identity-based signature, and reusable extractors in the standard model. The first scheme is based on Boneh-Franklin IBE similar to the Ateniese et al. scheme, but ours achieves a more compact decryption key and ciphertext and stronger CCA-privacy, CMA-authenticity, and mismatch security. The second scheme is an improved generic construction, which active not only stronger security but also the shortest ciphertext among existing generic constructions. Through this construction, we obtain, for example, a more efficient scheme from the symmetric external Diffie-Hellman assumption in the standard model, and a practical scheme from lattices in the quantum random oracle model.

Secure Multiparty Computation of Symmetric Functions with Polylogarithmic Bottleneck Complexity and Correlated Randomness

Bottleneck complexity is an efficiency measure of secure multiparty computation (MPC) protocols introduced to achieve load-balancing in large-scale networks, which is defined as the maximum communication complexity required by any one player within the protocol execution. Towards the goal of achieving low bottleneck complexity, prior works proposed MPC protocols for computing symmetric functions in the correlated randomness model, where players are given input-independent correlated randomness in advance. However, the previous protocols with polylogarithmic bottleneck complexity in the number of players $n$ require a large amount of correlated randomness that is linear in $n$, which limits the per-party efficiency as receiving and storing correlated randomness are the bottleneck for efficiency. In this work, we present for the first time MPC protocols for symmetric functions such that bottleneck complexity and the amount of correlated randomness are both polylogarithmic in $n$, assuming collusion of size at most $n-o(n)$ players. Furthermore, one of our protocols is even computationally efficient in that each player performs only $\mathrm{polylog}(n)$ arithmetic operations while the computational complexity of the previous protocols is $O(n)$. Technically, our efficiency improvements come from novel protocols based on ramp secret sharing to realize basic functionalities with low bottleneck complexity, which we believe may be of interest beyond their applications to secure computation of symmetric functions.

Privacy-Preserving Data Deduplication for Enhancing Federated Learning of Language Models

Deduplication is a vital preprocessing step that enhances machine learning model performance and saves training time and energy. However, enhancing federated learning through deduplication poses challenges, especially regarding scalability and potential privacy violations if deduplication involves sharing all clients’ data. In this paper, we address the problem of deduplication in a federated setup by introducing a pioneering protocol, Efficient Privacy-Preserving Multi-Party Deduplication (EP-MPD). It efficiently removes duplicates from multiple clients’ datasets without compromising data privacy. EP-MPD is constructed in a modular fashion, utilizing two novel variants of the Private Set Intersection protocol. Our extensive experiments demonstrate the significant benefits of deduplication in federated learning of large language models. For instance, we observe up to 19.61% improvement in perplexity and up to 27.95% reduction in running time. EP-MPD effectively balances privacy and performance in federated learning, making it a valuable solution for large-scale applications.

Finding Practical Parameters for Isogeny-based Cryptography

Isogeny-based schemes often come with special requirements on the field of definition of the involved elliptic curves. For instance, the efficiency of SQIsign, a promising candidate in the NIST signature standardisation process, requires a large power of two and a large smooth integer $T$ to divide $p^2-1$ for its prime parameter $p$.
We present two new methods that combine previous techniques for finding suitable primes: sieve-and-boost and XGCD-and-boost. We use these methods to find primes for the NIST submission of SQIsign. Furthermore, we show that our methods are flexible and can be adapted to find suitable parameters for other isogeny-based schemes such as AprèsSQI or POKE. For all three schemes, the parameters we present offer the best performance among all parameters proposed in the literature.

Improved High-Order Masked Generation of Masking Vector and Rejection Sampling in Dilithium

In this work, we introduce enhanced high-order masking techniques tailored for Dilithium, the post-quantum signature scheme recently standardized by NIST. We improve the masked generation of the masking vector $\vec{y}$, based on a fast Boolean-to-arithmetic conversion modulo $q$. We also describe an optimized gadget for the high-order masked rejection sampling, with a complexity independent from the size of the modulus $q$. We prove the security of our gadgets in the classical ISW $t$-probing model. Finally, we detail our open-source C implementation of these gadgets integrated into a fully masked Dilithium implementation, and provide an efficiency comparison with previous works.

A Univariate Attack against the Limited-Data Instance of Ciminion

With the increasing interest for advanced protocols for Multi Party Computation, Fully-Homomorphic Encryption or Zero Knowledge proofs, a need for cryptographic algorithms with new constraints has emerged. These algorithms, called Arithmetization-Oriented ciphers, seek to minimize the number of field multiplications in large finite fields $\mathbb{F}_{2^n}$ or $\mathbb{F}_{p}$. Among them, Ciminion is an encryption algorithm proposed by Dobraunig et al. in Eurocrypt 2021.
In this paper, we show a new univariate modelization on a variant of Ciminion proposed by the designers. This instance restricts the attacker to at most $2^{s/2}$ data, where $s$ is the security level. Because the designers chose to reduce the number of rounds in that specific attacker model, we are able to attack the cipher for large security levels. We also propose some slight modifications of Ciminion that would overcome this vulnerability.

Crypto Dark Matter on the Torus: Oblivious PRFs from shallow PRFs and FHE

Partially Oblivious Pseudorandom Functions (POPRFs) are 2-party protocols that allow a client to learn pseudorandom function (PRF) evaluations on inputs of its choice from a server. The client submits two inputs, one public and one private. The security properties ensure that the server cannot learn the private input, and the client cannot learn more than one evaluation per POPRF query. POPRFs have many applications including password-based key exchange and privacy-preserving authentication mechanisms. However, most constructions are based on classical assumptions, and those with post quantum security suffer from large eﬀiciency drawbacks.
In this work, we construct a novel POPRF from lattice assumptions and the “Crypto Dark Matter” PRF candidate (TCC’18) in the random oracle model. At a conceptual level, our scheme exploits the alignment of this family of PRF candidates, relying on mixed modulus computations, and programmable bootstrapping in the torus fully homomorphic encryption scheme (TFHE). We show that our construction achieves malicious client security based on circuit-private FHE, and client privacy from the semantic security of the FHE scheme. We further explore a heuristic approach to extend our scheme to support verifiability, based on the difficulty of computing cheating circuits in low depth. This would yield a verifiable (P)OPRF. We provide a proof-of-concept implementation and preliminary benchmarks of our construction. For the core online OPRF functionality, we require amortised 10.0KB communication per evaluation and a one-time per-client setup communication of 2.5MB.

On hermitian decomposition lattices and the module-LIP problem in rank 2

In this short note, we introduce a specific class of rank two lattices over CM fields endowed with additional symmetries, which are involved in the decomposition of algebraic integers in Hermitian squares. As an application, we show an elementary reduction from the module-LIP problem in rank 2 over a CM or totally real number field to the finding of a square basis in such lattices.

A reduction from Hawk to the principal ideal problem in a quaternion algebra

In this article we present a non-uniform reduction from rank-2 module-LIP over Complex Multiplication fields, to a variant of the Principal Ideal Problem, in some fitting quaternion algebra. This reduction is classical deterministic polynomial-time in the size of the inputs. The quaternion algebra in which we need to solve the variant of the principal ideal problem depends on the parameters of the module-LIP problem, but not on the problem’s instance. Our reduction requires the knowledge of some special elements of this quaternion algebras, which is why it is non-uniform.
In some particular cases, these elements can be computed in polynomial time, making the reduction uniform. This is in particular the case for the Hawk signature scheme: we show that breaking Hawk is no harder than solving a variant of the principal ideal problem in a fixed quaternion algebra (and this reduction is uniform).

Phase Modulation Side Channels: Jittery JTAG for On-Chip Voltage Measurements

Measuring the fluctuations of the clock phase of a target was identified as a leakage source on early electromagnetic side-channel investigations. Despite this, only recently was directly measuring the clock phase (or jitter) of digital signals from a target connected to being a source of exploitable leakage. As the phase of a clock output will be related to signal propagation delay through the target, and this propagation delay is related to voltage, this means that most digital devices perform an unintended phase modulation (PM) of their internal voltage onto clock output phases.
This paper first demonstrates an unprofiled CPA attack against a Cortex-M microcontroller using the phase of a clock output, observing the signal on both optically isolated and capacitively isolated paths. The unprofiled attack takes only 2-4x more traces than an attack using a classic shunt-resistor measurement.
It is then demonstrated how the JTAG bypass mode can be used to force a clock through a digital device. This forced clock signal can then be used as a highly effective oscilloscope that is located on the target device. As the attack does not require modifications to the device (such as capacitor removal or heat spreader removal) it is difficult to detect using existing countermeasures. The example attack over JTAG uses an unprofiled CPA attack, requiring only about 5x more traces than an ideal shunt-resistor based measurement. In addition, a version of this attack using a fault correlation analysis attack is also demonstrated.
Countermeasures are discussed, and a simple resampling countermeasure is tested. All tools both offensive and defensive presented in the paper have been released under open-source licenses.

Improved Universal Thresholdizer from Iterative Shamir Secret Sharing

The universal thresholdizer, introduced at CRYPTO'18, is a cryptographic scheme that transforms any cryptosystem into a threshold variant, thereby enhancing its applicability in threshold cryptography. It enables black-box construction of one-round threshold signature schemes based on the Learning with Errors problem, and similarly, facilitates one-round threshold ciphertext-attack secure public key encryption when integrated with non-threshold schemes.
Current constructions of universal thresholdizer are fundamentally built upon linear secret sharing schemes.
One approach employs Shamir's secret sharing, which lacks compactness and results in ciphertext sizes of $O(N \log N)$, and another approach uses $\{0,1\}$-linear secret sharing scheme ($\{0,1\}$-LSSS), which is compact but induces high communication costs due to requiring $O(N^{5.3})$ secret shares.
In this work, we introduce a communication-efficient universal thresholdizer by revising the linear secret sharing scheme.
We propose a specialized linear secret sharing scheme, called TreeSSS, which reduces the number of required secret shares
$O(N^{3+o(1)})$ while maintaining
the compactness of the universal thresholdizer.
TreeSSS can also serve as a subroutine for constructing lattice based $t$-out-of-$N$ threshold cryptographic primitives such as threshold fully homomorphic encryptions and threshold signatures. In this context, TreeSSS offers the advantage of lower communication overhead due to the reduced number of secret shares involved.

Round Efficient Byzantine Agreement from VDFs

Byzantine agreement (BA) is a fundamental primitive in distributed systems and has received huge interest as an important building block for blockchain systems. Classical byzantine agreement considers a setting where $n$ parties with fixed, known identities want to agree on an output in the presence of an adversary. Motivated by blockchain systems, the assumption of fixed identities is weakened by using a \emph{resource-based model}. In such models, parties do not have fixed known identities but instead have to invest some expensive resources to participate in the protocol. Prominent examples for such resources are computation (measured by, e.g., proofs-of-work) or money (measured by proofs-of-stake). Unlike in the classical setting where BA without trusted setup (e.g., a PKI or an unpredictable beacon) is impossible for $t \geq n/3$ corruptions, in such resource-based models, BA can be constructed for the optimal threshold of $t <n/2$. In this work, we investigate BA without a PKI in the model where parties have restricted computational resources. Concretely, we consider sequential computation modeled via computing a verifiable delay function (VDF) and establish the following results:
Positive Result: We present the first protocol for BA with expected constant round complexity and termination under adaptive corruption, honest majority and without a PKI. Earlier work achieved round complexity $O(n\kappa^2)$ (CRYPTO'15) or $O(\kappa)$ (PKC'18), where $\kappa$ is the security parameter.
Negative Result: We give the first lower bound on the communication complexity of BA in a model where parties have restricted computational resources. Concretely, we show that a multicast complexity of $O(\sqrt{n})$ is necessary even if the parties have access to a VDF oracle.

SIGMA: Secure GPT Inference with Function Secret Sharing

Secure 2-party computation (2PC) enables secure inference that offers protection for both proprietary machine learning (ML) models and sensitive inputs to them. However, the existing secure inference solutions suffer from high latency and communication overheads,
particularly for transformers. Function secret sharing (FSS) is a recent paradigm for
obtaining efficient 2PC protocols with a preprocessing phase.
We provide SIGMA, the first end-to-end system for secure transformer inference based on FSS.
By constructing new FSS-based protocols for complex machine learning functionalities, such as Softmax, GeLU and SiLU, and also accelerating their computation on GPUs, SIGMA improves the latency of secure inference of transformers by $11-19\times$ over the state-of-the-art that uses preprocessing and GPUs. We present the first secure inference of generative pre-trained transformer (GPT) models. In particular, SIGMA executes Meta's LLaMA2 (available on HuggingFace) with 13 billion parameters in 44 seconds and GPT2 in 1.6 seconds.

Breaking Free: Efficient Multi-Party Private Set Union Without Non-Collusion Assumptions

Multi-party private set union (MPSU) protocol enables $m$ $(m > 2)$ parties, each holding a set, to collectively compute the union of their sets without revealing any additional information to other parties. There are two main categories of MPSU protocols: The first builds on public-key techniques. All existing works in this category involve a super-linear number of public-key operations, resulting in poor practical efficiency. The second builds on oblivious transfer and symmetric-key techniques. The only existing work in this category is proposed by Liu and Gao (ASIACRYPT 2023), which features the best concrete performance among all existing protocols, despite its super-linear computation and communication. Unfortunately, it does not achieve the standard semi-honest security, as it inherently relies on a non-collusion assumption, which is unlikely to hold in practice. Therefore, the problem of constructing a practical MPSU protocol based on oblivious transfer and symmetric-key techniques in standard semi-honest model remains open. Furthermore, there is no MPSU protocol achieving both linear computation and linear communication complexity, which leaves another unresolved problem. In this work, we resolve these two open problems.
- We propose the first MPSU protocol based on oblivious transfer and symmetric-key techniques in the standard semi-honest model. This protocol is $4.9-9.3 \times$ faster than Liu and Gao in the LAN setting. Concretely, our protocol requires only $3.6$ seconds in online phase for 3 parties with sets of $2^{20}$ items each.
- We propose the first MPSU protocol achieving both linear computation and linear communication complexity, based on public-key operations. This protocol has the lowest overall communication costs and shows a factor of $3.0-36.5\times$ improvement in terms of overall communication compared to Liu and Gao.
We implement our protocols and conduct an extensive experiment to compare the performance of our protocols and the state-of-the-art. To the best of our knowledge, our implementation is the first correct and secure implementation of MPSU that reports on large-size experiments.

On the Security of KOS

We study the security of the random oblivious transfer extension protocol of Keller, Orsini, and Scholl (CRYPTO '15), whose security proof was recently invalidated by Roy (CRYPTO '22). We show that KOS is asymptotically secure. Our proof involves a subtle analysis of the protocol's "correlation check", and introduces several new techniques. We also study the protocol's concrete security. We establish concrete security for security parameter values on the order of 5,000. We present evidence that a stronger result than ours—if possible—is likely to require radically new ideas.

STIR: Reed–Solomon Proximity Testing with Fewer Queries

We present STIR (Shift To Improve Rate), an interactive oracle proof of proximity (IOPP) for Reed-Solomon codes that achieves the best known query complexity of any concretely efficient IOPP for this problem. For $\lambda$ bits of security, STIR has query complexity $O(\log d + \lambda \cdot \log \log d )$, while FRI, a popular protocol, has query complexity $O(\lambda \cdot \log d )$ (including variants of FRI based on conjectured security assumptions). STIR relies on a new technique for recursively improving the rate of the tested Reed-Solomon code.
We provide an implementation of STIR compiled to a SNARK. Compared to a highly-optimized implementation of FRI, STIR achieves an improvement in argument size that ranges from $1.25\times$ to $2.46\times$ depending on the chosen parameters, with similar prover and verifier running times. For example, in order to achieve 128 bits of security for degree $2^{26}$ and rate $1/4$, STIR has argument size $114$ KiB, compared to $211$ KiB for FRI.

A Practical and Scalable Implementation of the Vernam Cipher, under Shannon Conditions, using Quantum Noise

The one-time pad cipher is renowned for its theoretical perfect security, yet its practical deployment is primarily hindered by the key-size and distribution challenge. This paper introduces a novel approach to key distribution called q-stream, designed to make symmetric-key cryptography, and the one-time pad cipher in particular, a viable option for contemporary secure communications, and specifically, post-quantum cryptography, leveraging quantum noise and combinatorics to ensure secure and efficient key-distribution between communicating parties. We demonstrate that our key-distribution mechanism has a variable, yet quantifiable hardness of at least 504 bits, established from immutable mathematical laws, rather than conjectured-intractability, and how we overcome the one-time pad key-size issue with a localised quantum-noise seeded key-generation function, having a system hardness of at least 2304 bits, while introducing sender authentication and message integrity. Whilst the proposed solution has potential applications in fields requiring very high levels of security, such as military communications and large financial transactions, we show from our research with a prototype of q-stream, that it is sufficiently practical and scaleable for use in common browser-based web-applications, without any modification to the browser (i.e. plug-ins), running above SSL/TLS at the application level, where in tests, it achieved a key-distribution rate of around 7 million keys over a 5 minute surge-window, in a single (multi-threaded) instance of q-stream.

A Note on ``Secure and Distributed IoT Data Storage in Clouds Based on Secret Sharing and Collaborative Blockchain''

We show that the data storage scheme [IEEE/ACM Trans. Netw., 2023, 31(4), 1550-1565] is flawed due to the false secret sharing protocol, which requires that some random $4\times 4$ matrixes over the finite field $F_p$ (a prime $p$) are invertible. But we find its mathematical proof for invertibility is incorrect. To fix this flaw, one needs to check the invertibility of all 35 matrixes so as to generate the proper 7 secret shares.

LR-OT: Leakage-Resilient Oblivious Transfer

Oblivious Transfer (OT) is a fundamental cryptographic primitive, becoming a crucial component of a practical secure protocol.
OT is typically implemented in software, and one way to accelerate its running time is by using hardware implementations.
However, such implementations are vulnerable to side-channel attacks (SCAs).
On the other hand, protecting interactive protocols against SCA is highly challenging because of their longer secrets (which include inputs and randomness), more complicated design, and running multiple instances.
Consequently, there are no truly practical leakage-resistant OT protocols yet.
In this paper, we introduce two tailored indistinguishability-based security definitions for leakage-resilient OT, focusing on protecting the sender's state.
Second, we propose a practical semi-honest secure OT protocol that achieves these security levels while minimizing the assumptions on the protocol's building blocks and the use of a secret state.
Finally, we extend our protocol to support sequential composition and explore efficiency-security tradeoffs.

Last updated: 2024-07-13

Predicting one class of truncated matrix congruential generators with unknown parameters

Matrix congruential generators is an important class of pseudorandom number
generators. In this paper we show how to predict a class of Matrix congruential generators matrix
congruential generators with unknown parameters. Given a few truncated digits
of high-order bits output by a matrix congruential generator, we give a method
based on lattice reduction to recover the parameters and the initial state of the
generator.

Optimized Privacy-Preserving Clustering with Fully Homomorphic Encryption

Clustering is a crucial unsupervised learning method extensively used in the field of data analysis. For analyzing big data, outsourced computation is an effective solution but privacy concerns arise when involving sensitive information. Fully homomorphic encryption (FHE) enables computations on encrypted data, making it ideal for such scenarios. However, existing privacy-preserving clustering based on FHE are often constrained by the high computational overhead incurred from FHE, typically requiring decryption and interactions after only one iteration of the clustering algorithm. In this work, we propose a more efficient approach to evaluate the one-hot vector for the index of the minimum in an array with FHE, which fully exploits the parallelism of single-instruction-multiple-data of FHE schemes. By combining this with FHE bootstrapping, we present a practical FHE-based k-means clustering protocol whose required round of interactions between the data owner and the server is optimal, i.e., accomplishing the entire clustering process on encrypted data in a single round. We implement this protocol using the CKKS FHE scheme. Experiments show that our protocol significantly outperforms the state-of-the-art FHE-based k-means clustering protocols on various public datasets and achieves comparable accuracy to plaintext result. Additionally, We adapt our protocol to support mini-batch k-means for large-scale datasets and report its performance.

Fault-Resistant Partitioning of Secure CPUs for System Co-Verification against Faults

Fault injection attacks are a serious threat to system security, enabling attackers to bypass protection mechanisms or access sensitive information. To evaluate the robustness of CPU-based systems against these attacks, it is essential to analyze the consequences of the fault propagation resulting from the complex interplay between the software and the processor. However, current formal methodologies combining hardware and software face scalability issues due to the monolithic approach used.
To address this challenge, this work formalizes the $k$-fault resistant partitioning notion to solve the fault propagation problem when assessing redundancy-based hardware countermeasures in a first step. Proven security guarantees can then reduce the remaining hardware attack surface when introducing the software in a second step. First, we validate our approach against previous work by reproducing known results on cryptographic circuits. In particular, we outperform state-of-the-art tools for evaluating AES under a three-fault-injection attack. Then, we apply our methodology to the OpenTitan secure element and formally prove the security of its CPU's hardware countermeasure to single bit-flip injections. Besides that, we demonstrate that previously intractable problems, such as analyzing the robustness of OpenTitan running a secure boot process, can now be solved by a co-verification methodology that leverages a $k$-fault resistant partitioning. We also report a potential exploitation of the register file vulnerability in two other software use cases. Finally, we provide a security fix for the register file, prove its robustness, and integrate it into the OpenTitan project.

Prime Masking vs. Faults - Exponential Security Amplification against Selected Classes of Attacks

Fault injection attacks are a serious concern for cryptographic hardware. Adversaries may extract sensitive information from the faulty output that is produced by a cryptographic circuit after actively disturbing its computation. Alternatively, the information whether an output would have been faulty, even if it is withheld from being released, may be exploited. The former class of attacks, which requires the collection of faulty outputs, such as Differential Fault Analysis (DFA), then either exploits some knowledge about the position of the injected fault or about its value. The latter class of attacks, which can be applied without ever obtaining faulty outputs, such as Statistical Ineffective Fault Attacks (SIFA), then either exploits a dependency between the effectiveness of the fault injection and the value to be faulted (e.g., an LSB stuck-at-0 only affecting odd numbers), denoted as SIFA-1, or a conditional propagation of a faulted value based on a sensitive intermediate (e.g., multiplication of a faulted value by 0 prevents propagation), denoted as SIFA-2. The aptitude of additive masking schemes, which were designed to prevent side-channel analysis, to also thwart fault attacks is typically assumed to be limited. Common fault models, such as toggle/bit-flip, stuck-at-0 or stuck-at-1 survive the recombination of Boolean shares well enough for generic attacks to succeed. More precisely, injecting a fault into one or multiple Boolean shares often results in the same, or at least a predictable, error appearing in the sensitive variable after recombination. In this work, we show that additive masking in prime-order fields breaks such relationships, causing frequently exploited biases to decrease exponentially in the number of shares. As a result, prime masking offers surprisingly strong protection against generic statistical attacks, which require a dependency between the effectiveness of an injected fault and the secret variable that is manipulated, such as SIFA-1. Operation-dependent statistical attacks, such as SIFA-2 and Fault Template Attacks (FTA), may still be performed against certain prime-field structures, even if they are masked with many shares. Yet, we analyze the corresponding cases and are able to provide specific guidelines on how to avoid vulnerabilities either at the cipher design or implementation level by making informed decisions about the primes, non-linear mappings and masked gadgets used. Since prime-field masking appears to be one of the rare instances of affordable countermeasures that naturally provide sound protection against sidechannel analysis and certain fault injection attacks, we believe there is a strong incentive for developing new ciphers to leverage these advantages.

Permutation Superposition Oracles for Quantum Query Lower Bounds

We propose a generalization of Zhandry’s compressed oracle method to random permutations, where an algorithm can query both the permutation and its inverse. We show how to use the resulting oracle simulation to bound the success probability of an algorithm for any predicate on input-output pairs, a key feature of Zhandry’s technique that had hitherto resisted attempts at generalization to random permutations. One key technical ingredient is to use strictly monotone factorizations to represent the permutation in the oracle’s database. As an application of our framework, we show that the one-round sponge construction is unconditionally preimage resistant in the random permutation model. This proves a conjecture by Unruh.

Bake It Till You Make It: Heat-induced Power Leakage from Masked Neural Networks

Masking has become one of the most effective approaches for securing hardware designs against side-channel attacks. Regardless of the effort put into correctly implementing masking schemes on a field-programmable gate array (FPGA), leakage can be unexpectedly observed. This is due to the fact that the assumption underlying all masked designs, i.e., the leakages of different shares are independent of each other, may no longer hold in practice. In this regard, extreme temperatures have been shown to be an important factor in inducing leakage, even in correctly masked designs. This has previously been verified using an external heat generator
(i.e., a climate chamber). In this paper, we examine whether the leakage can be induced using the circuit components themselves without making any changes to the design. Specifically, we target masked neural networks (NNs) in FPGAs, one of the main building blocks of which is block random access memory (BRAM). In this respect, thanks to the inherent characteristics of NNs, our novel internal heat generators leverage solely the memories devoted to storing the user’s input, especially when frequently writing alternating patterns into BRAMs. The possibility of observing first-order leakage is evaluated by considering one of the most recent and successful first-order secure masked NNs, namely ModuloNET. ModuloNET is specifically designed for FPGAs, where BRAMs are used to store inputs and intermediate computations. Our experimental results demonstrate that undesirable first-order leakage can be observed and exploited by increasing the temperature when an alternating input is applied to the masked NN. To give a better understanding of the impact of extreme heat, we further perform a similar test on the design using an external heat generator, where a similar conclusion can be drawn.

Anonymous Outsourced Statekeeping with Reduced Server Storage

Strike-lists are a common technique for rollback and replay prevention in protocols that require that clients remain anonymous or that their current position in a state machine remain confidential. Strike-lists are heavily used in anonymous credentials, e-cash schemes, and trusted execution environments, and are widely deployed on the web in the form of Privacy Pass (PoPETS '18) and Google Private State Tokens.
In such protocols, clients submit pseudorandom tokens associated with each action (e.g., a page view in Privacy Pass) or state transition, and the token is added to a server-side list to prevent reuse.
Unfortunately, the size of a strike-list, and hence the storage required by the server, is proportional to the total number of issued tokens, $N \cdot t$, where $N$ is the number of clients and $t$ is the maximum number of tickets per client. In this work, we ask whether it is possible to realize a strike-list-like functionality, which we call the anonymous tickets functionality, with storage requirements proportional to $N \log(t)$.
For the anonymous tickets functionality we construct a secure protocol from standard assumptions that achieves server storage of $O(N)$ ciphertexts, where each ciphertext encrypts a message of length $O(\log(t))$. We also consider an extension of the strike-list functionality where the server stores an arbitrary state for each client and clients advance their state with some function $s_i\gets f(s_{i-1},\mathsf{auxinput})$, which we call the anonymous outsourced state-keeping functionality. In this setting, malicious clients are prevented from rolling back their state, while honest clients are guaranteed anonymity and confidentiality against a malicious server. We achieve analogous results in this setting for two different classes of functions.
Our results rely on a new technique to preserve client anonymity in the face of selective failure attacks by a malicious server. Specifically, our protocol guarantees that misbehavior of the server either (1) does not prevent the honest client from redeeming a ticket or (2) provides the honest client with an escape hatch that can be used to simulate a redeem in a way that is indistinguishable to the server.

Toward A Practical Multi-party Private Set Union

This paper studies a multi-party private set union (mPSU), a fundamental cryptographic problem that allows multiple parties to compute the union of their respective datasets without revealing any additional information. We propose an efficient mPSU protocol which is secure in the presence of any number of colluding semi-honest participants. Our protocol avoids computationally expensive homomorphic operations or generic multi-party computation, thus providing an efficient solution for mPSU.
The crux of our protocol lies in the utilization of new cryptographic tool, namely, Membership Oblivious Transfer (mOT) . We believe that the mOT and cOPRF may be of independent interest. We implement our mPSU protocol and evaluate their performance. Our protocol shows an improvement of up to $80.84\times$ in terms of running time and $405.73\times$ bandwidth cost compared to the existing state-of-the-art protocols.
*Note: June 2024 - Fixed a bug in the original version and simplified the protocol by removing the OPRF.

A Two-Layer Blockchain Sharding Protocol Leveraging Safety and Liveness for Enhanced Performance

Sharding is a critical technique that enhances the scalability of blockchain technology. However, existing protocols often assume adversarial nodes in a general term without considering the different types of attacks, which limits transaction throughput at runtime because attacks on liveness could be mitigated. There have been attempts to increase transaction throughput by separately handling the attacks; however, they have security vulnerabilities. This paper introduces Reticulum, a novel sharding protocol that overcomes these limitations and achieves enhanced scalability in a blockchain network without security vulnerabilities.
Reticulum employs a two-phase design that dynamically adjusts transaction throughput based on runtime adversarial attacks on either or both liveness and safety. It consists of `control' and `process' shards in two layers corresponding to the two phases. Process shards are subsets of control shards, with each process shard expected to contain at least one honest node with high confidence. Conversely, control shards are expected to have a majority of honest nodes with high confidence. Reticulum leverages unanimous voting in the first phase to involve fewer nodes in accepting/rejecting a block, allowing more parallel process shards. The control shard finalizes the decision made in the first phase and serves as a lifeline to resolve disputes when they surface.
Experiments demonstrate that the unique design of Reticulum empowers high transaction throughput and robustness in the face of different types of attacks in the network, making it superior to existing sharding protocols for blockchain networks.

Dot-Product Proofs and Their Applications

A dot-product proof (DPP) is a simple probabilistic proof system in which the input statement $\mathbf{x}$ and the proof $\boldsymbol{\pi}$ are vectors over a finite field $\mathbb{F}$, and the proof is verified by making a single dot-product query $\langle \mathbf{q},(\mathbf{x} \| \boldsymbol{\pi}) \rangle$ jointly to $\mathbf{x}$ and $\boldsymbol{\pi}$. A DPP can be viewed as a 1-query fully linear PCP. We study the feasibility and efficiency of DPPs, obtaining the following results:
- Small-field DPP. For any finite field $\mathbb{F}$ and Boolean circuit $C$ of size $S$, there is a DPP for proving that there exists $\mathbf{w}$ such that $C(\mathbf{x}, \mathbf{w})=1$ with a proof $\boldsymbol{\pi}$ of length $S\cdot\mathsf{poly}(|\mathbb{F}|)$ and soundness error $\varepsilon=O(1 / \sqrt{|\mathbb{F}|})$. We show this error to be asymptotically optimal. In particular, and in contrast to the best known PCPs, there exist strictly linear-length DPPs over constant-size fields.
- Large-field DPP. If $|\mathbb{F}|\ge\mathsf{poly}(S/\varepsilon)$, there is a similar DPP with soundness error $\varepsilon$ and proof length $O(S)$ (in field elements).
The above results do not rely on the PCP theorem and their proofs are considerably simpler. We apply our DPP constructions toward two kinds of applications.
- Hardness of approximation. We obtain a simple proof for the NP-hardness of approximating MAXLIN (with dense instances) over any finite field $\mathbb{F}$ up to some constant factor $c>1$, independent of $\mathbb{F}$. Unlike previous PCP-based proofs, our proof yields exponential-time hardness under the exponential time hypothesis (ETH).
- Succinct arguments. We improve the concrete efficiency of succinct interactive arguments in the generic group model using input-independent preprocessing. In particular, the communication is comparable to sending two group elements and the verifier's computation is dominated by a single group exponentiation. We also show how to use DPPs together with linear-only encryption to construct succinct commit-and-prove arguments.

Secret Key Recovery in a Global-Scale End-to-End Encryption System

End-to-end encrypted messaging applications ensure that an attacker cannot read a user's message history without their decryption keys. While this provides strong privacy, it creates a usability problem: if a user loses their devices and cannot access their decryption keys, they can no longer access their account. To solve this usability problem, users should be able to back up their account information with the messaging provider. For privacy, this backup should be encrypted and the provider should not have access to users' decryption keys. To solve this problem, we present Secure Value Recovery 3 (SVR3), a secret key recovery system that distributes trust across different types of hardware enclaves run by different cloud providers in order to protect users' decryption keys. SVR3 is the first deployed secret key recovery system to split trust across heterogeneous enclaves managed by different cloud providers: this design ensures that a single type of enclave does not become a central point of attack. SVR3 protects decryption keys via rollback protection and fault tolerance techniques tailored to the enclaves' security guarantees. SVR3 costs \$0.0025/user/year and takes 365ms for a user to recover their key, which is a rare operation. A part of SVR3 has been rolled out to millions of real users in a deployment with capacity for over 500 million users, demonstrating the ability to operate at scale.

Cryptanalysis of EagleSign

EagleSign is one of the 40 “Round 1 Additional Signatures” that is accepted for consideration in the supplementary round of the Post-Quantum Cryptography standardization process, organized by NIST. Its design is based on structured lattices, and it boasts greater simplicity and performance compared to the two lattice signatures already selected for standardization: Falcon and Dilithium.
In this paper, we show that those claimed advantages come at the cost of security. More precisely, we show that the distribution of EagleSign signatures leaks information about the private key, to the point that only a few hundred signatures on arbitrary known messages suffice for a full key recovery, for all proposed parameters.
A related vulnerability also affects EagleSign-V2, a subsequent version of the scheme specifically designed to thwart the initial attack. Although a larger number of signatures is required for key recovery, the idea of the attack remains largely similar. Both schemes come with proofs of security that we show are flawed.

Probabilistic Linearization: Internal Differential Collisions in up to 6 Rounds of SHA-3

The SHA-3 standard consists of four cryptographic hash functions, called SHA3-224, SHA3-256, SHA3-384 and SHA3-512, and two extendable-output functions (XOFs), called SHAKE128 and SHAKE256. In this paper, we study the collision resistance of the SHA-3 instances. By analyzing the nonlinear layer, we introduce the concept of maximum difference density subspace, and develop a new target internal difference algorithm by probabilistic linearization. We also exploit new strategies for optimizing the internal differential characteristic. Further more, we figure out the expected size of collision subsets in internal differentials, by analyzing the collision probability of the digests rather than the intermediate states input to the last nonlinear layer. These techniques enhance the analysis of internal differentials, leading to the best collision attacks on four round-reduced variants of the SHA-3 instances. In particular, the number of attacked rounds is extended to 5 from 4 for SHA3-384, and to 6 from 5 for SHAKE256.

Counting Unpredictable Bits: A Simple PRG from One-way Functions

A central result in the theory of Cryptography, by Hastad, Imagliazzo, Luby and Levin [SICOMP’99], demonstrates that the existence one-way functions (OWF) implies the existence of pseudo-random generators (PRGs). Despite the fundamental importance of this result, and several elegant improvements/simplifications, analyses of constructions of PRGs from OWFs remain complex (both conceptually and technically).
Our goal is to provide a construction of a PRG from OWFs with a simple proof of security; we thus focus on the setting of non-uniform security (i.e., we start off with a OWF secure against non-uniform PPT, and we aim to get a PRG secure against non-uniform PPT).
Our main result is a construction of a PRG from OWFs with a self-contained, simple, proof of security, relying only on the Goldreich-Levin Theorem (and the Chernoff bound). Although our main goal is simplicity, the construction, and a variant there-of, also improves the efficiency—in terms of invocations and seed lengths—of the state-of-the-art constructions due to [Haitner-Reingold-Vadhan, STOC’10] and [Vadhan-Zheng, STOC’12], by a factor $O(\log^2 n)$.
The key novelty in our analysis is a generalization of the Blum-Micali [FOCS’82] notion of unpredictabilty—rather than requiring that every bit in the output of a function is unpredictable, we count how many unpredictable bits a function has, and we show that any OWF on $n$ input bits (after hashing the input and the output) has $n + O(\log n)$ unpredictable output bits. Such unpredictable bits can next be “extracted” into a pseudorandom string using standard techniques.

Scalable and Lightweight State-Channel Audits

Payment channels are one of the most prominent off-chain scaling solutions for blockchain systems. However, regulatory institutions have difficulty embracing them, as the channels lack insights needed for Anti-Money Laundering (AML) auditing purposes. Our work tackles the problem of a formal reliable and controllable inspection of off-ledger payment channels, by offering a novel approach for maintaining and reliably auditing statistics of payment channels. We extend a typical trustless Layer 2 protocol and provide a lightweight and scalable protocol such that:
- every state channel is provably auditable w.r.t. a configurable set of policy queries, such that a regulator can retrieve reliable insights about the channel;
- no information beyond the answers to auditing queries is leaked;
- the cryptographic operations are inexpensive, the setup is simple, and storage complexity is independent of the transaction graph's size.
We present a concrete protocol, based on Hydra Isomorphic State Channels (FC'21), and tie the creation of a state channel to real-world identifiers, both in a plain and privacy-preserving manner. For this, we employ verifiable credentials for decentralized identifiers, specifically verifiable Legal Entity Identifiers (vLEI) that increasingly gain traction for financial service providers and regulated institutions.

Exploiting signature leakages: breaking Enhanced pqsigRM

Enhanced pqsigRM is a code-based hash-and-sign scheme proposed to the second National Institute of Standards and Technology call for post-quantum signatures. The scheme is based on the $(U,U+V)$-construction and it enjoys remarkably small signature lengths, about $1$KBytes for a security level of $128$ bits. Unfortunately we show that signatures leak information about the underlying $(U,U+V)$-structure. It allows to retrieve the private-key with~$100, 000$ signatures.

SQISignHD: New Dimensions in Cryptography

We introduce SQIsignHD, a new post-quantum digital signature scheme inspired by SQIsign.
SQIsignHD exploits the recent algorithmic breakthrough underlying the attack on SIDH, which allows to efficiently represent isogenies of arbitrary degrees as components of a higher dimensional isogeny. SQIsignHD overcomes the main drawbacks of SQIsign. First, it scales well to high security levels, since the public parameters for SQIsignHD are easy to generate: the characteristic of the underlying field needs only be of the form $2^{f}3^{f'}-1$. Second, the signing procedure is simpler and more efficient. Our signing procedure implemented in C runs in 28 ms, which is a significant improvement compared to SQISign. Third, the scheme is easier to analyse, allowing for a much more compelling security reduction. Finally, the signature sizes are even more compact than (the already record-breaking) SQIsign, with compressed signatures as small as 109 bytes for the post-quantum NIST-1 level of security.
These advantages may come at the expense of the verification, which now requires the computation of an isogeny in dimension $4$, a task whose optimised cost is still uncertain, as it has been the focus of very little attention. Our experimental sagemath implementation of the verification runs in around 600 ms, indicating the potential cryptographic interest of dimension $4$ isogenies after optimisations and low level implementation.

Parameters of Algebraic Representation vs. Efficiency of Algebraic Cryptanalysis

The aim of an algebraic attack is to find the secret key by solving
a collection of relations that describe the internal structure of a cipher
for observations of plaintext/cipher-text pairs.
Although algebraic attacks are addressed for cryptanalysis of block and
stream ciphers, there is a limited understanding of the impact of algebraic
representation of the cipher on the efficiency of solving the resulting collection of equations.
In this paper, we investigate on how different S-box representations affect
the complexity of algebraic attacks, in an empirical manner.
In the literature some algebraic properties are intuitively proposed to evaluate optimality of an algebraic description of S-boxes for algebraic cryptanalysis.
In this paper, we compare different S-box representation for algebraic
cryptanalysis with doing experiments with SR family of block ciphers.
We also show that the so-called \textit{Forward-Backward} representation which is in contrast with all mentioned criteria for optimal representations criteria, practically gives better results than the compliant representations.
We also compare the representations for both $GF(2)$ and $GF(2^n)$ fields.

KAIME : Central Bank Digital Currency with Realistic and Modular Privacy

Recently, with the increasing interest in Central Bank Digital Currency (CBDC), many countries have been working on researching and developing digital currency. The most important reasons for this interest are that CBDC eliminates the disadvantages of traditional currencies and provides a safer, faster, and more efficient payment system. These benefits also come with challenges, such as safeguarding individuals’ privacy and ensuring regulatory mechanisms. While most researches address the privacy conflict between users and regulatory agencies, they miss an important detail. Important parts of a financial system are banks and financial institutions. Some studies ignore the need for privacy and include these institutions in the CBDC system, no system currently offers a solution to the privacy conflict between banks, financial institutions, and users. In this study, while we offer a solution to the privacy conflict between the user and the regulatory agencies, we also provide a solution to the privacy conflict between the user and the banks. Our solution, KAIME has also a modular structure. The privacy of the sender and receiver can be hidden if desired. Compared to previous related research, security analysis and implementation of KAIME is substantially simpler because simple and well-known cryptographic methods are used.

BG: A Modular Treatment of BFT Consensus

Uncategorized

Uncategorized

We provide an expressive framework that allows analyzing and generating provably secure, state-of-the-art Byzantine fault-tolerant (BFT) protocols. Our framework is hierarchical, including three layers. The top layer is used to model the message pattern and abstract key functions on which BFT algorithms can be built. The intermediate layer provides the core functions with high-level properties sufficient to prove the security of the top-layer algorithms. The bottom layer carefully defines predicates according to which we offer operational realizations for the core functions. All three layers in our framework are extensible and enable innovation. One may modify or extend any layer to theoretically cover all BFT protocols, known and unknown. Indeed, unlike prior BFT frameworks, our framework can analyze and recast BFT protocols in an exceedingly fine-grained manner. More importantly, our framework can readily generate new BFT protocols by simply enumerating the parameters in the framework. In this paper, we show that the framework allows us to fully specify and formally prove the security for 23 BFT protocols, including protocols matching HotStuff, Fast-HotStuff, Jolteon, and Marlin, and among these protocols, seven new protocols outperforming existing ones or achieving meaningful trade-offs among various performance metrics.

Synchronous Distributed Key Generation without Broadcasts

Distributed key generation (DKG) is an important building block in designing many efficient distributed protocols. In this work, we initiate the study of communication complexity and latency of distributed key generation protocols under a synchronous network in a point-to-point network. Our key result is the first synchronous DKG protocol for discrete log-based cryptosystems with $O(\kappa n^3)$ communication complexity ($\kappa$ denotes a security parameter) that tolerates $t < n/2$ Byzantine faults among $n$ parties. We show two variants of the protocol: a deterministic protocol with $O(t\Delta)$ latency and randomized protocol with $O(\Delta)$ latency in expectation where $\Delta$ denotes the bounded synchronous delay. In the process of achieving our results, we design (1) a gradecast protocol with optimal communication complexity of $O(\kappa n^2)$ for linear-sized inputs and latency of $O(\Delta)$, (2) a primitive called ``recoverable set of shares'' for ensuring recovery of shared secrets, (3) an oblivious leader election protocol with $O(\kappa n^3)$ communication and $O(\Delta)$ latency, and (4) a multi-valued validated Byzantine agreement (MVBA) protocol with $O(\kappa n^3)$ communication complexity for linear-sized inputs and $O(\Delta)$ latency in expectation. Each of these primitives may be of independent interest.

Constraint-Packing and the Sum-Check Protocol over Binary Tower Fields

SNARKs based on the sum-check protocol often invoke the ``zero-check PIOP''. This reduces the vanishing of many constraints to a single sum-check instance applied to an $n$-variate polynomial of the form $g(x) = \text{eq}(r,x) \cdot p(x)$, where $p$ is a product of multilinear polynomials, $r$ is a random vector, and $\text{eq}$ is the multilinear extension of the equality function. In recent SNARK designs, $p(x)$ is defined over a ``small'' base field, while $r$ is drawn from a large extension field $\mathbb{F}$ for security.
Recent papers (Bagad, Domb, and Thaler 2024; Gruen 2024) have optimized the sum-check protocol prover for this setting. However, these works still require the prover to ``pre-compute'' all evaluations of $\text{eq}(r, x)$ as $x$ ranges over $\{0, 1\}^{n}$,
and this computation involves about $n$ multiplications over the extension field $\mathbb{F}$.
In this note, we describe a modification to the zero-check PIOP in the case of binary tower fields that reduces this pre-computation cost by a factor of close to $\log |\mathbb{F}|$, which is $128$ in important applications. We show that our modification is sound, and that it strictly generalizes a (possibly folklore) technique of constraint-packing over field extensions.

Jolt-b: recursion friendly Jolt with basefold commitment

The authors of Jolt [AST24] pioneered a unique method for creating zero-knowledge virtual machines, known as the lookup singularity. This technique extensively uses lookup tables to create virtual machine circuits. Despite Jolt’s performance being twice as efficient as the previous state-of-the-art1 , there is potential for further enhancement.
The initial release of Jolt uses Spartan [Set20] and Hyrax [WTs+ 18] as their backend, leading to two constraints. First, Hyrax employs Pedersen commitment to build inner product arguments, which requires elliptic curve operations. Second, the verification of a Hyrax commitment takes square root time $O(\sqrt{N})$ relative to the circuit size $N$ . This makes the recursive verification of a Jolt proof impractical, as the verification circuit would need to execute all the Hyrax verification logic in-circuit. A later version of Jolt includes Zeromorph [KT23] and HyperKZG as their commitment backend, making the system recursion-friendly, as now the recursive verifier only needs to perform $O(\log N)$ operations, but at the
expense of a need for a trusted setup.
Our scheme, Jolt-b, addresses these issues by transitioning to the extension field of the Goldilocks and using the Basefold commitment scheme [ZCF23], which has an $O(\log^2 N)$ verifier time. This scheme mirrors the modifications of Plonky2 over the original Plonk [GWC19]: it transitions from EC fields to the Goldilocks field; it replaces the EC-based commitment scheme with an encoding-based commitment scheme.
We implemented Jolt-b, along with an optimized version of the Basefold scheme. Our benchmarks show that at a cost of 2.47x slowdown for the prover, we achieve recursion friendliness for the original Jolt. In comparison with other recursion-friendly Jolt variants, our scheme is 1.24x and 1.52x faster in prover time than the Zeromorph and HyperKZG variants of Jolt, respectively.

Distributed Verifiable Random Function With Compact Proof

Verifiable Random Functions (VRFs) are cryptographic primitives that generate unpredictable randomness along with proofs that are verifiable, a critical requirement for blockchain applications in decentralized finance, online gaming, and more. Existing VRF constructions often rely on centralized entities, creating security vulnerabilities. Distributed VRFs (DVRFs) offer a decentralized alternative but face challenges like large proof sizes or dependence on computationally expensive bilinear pairings.
In this research, a unique distributed VRF (DVRF) system called DVRFwCP with considerable improvements is proposed. DVRFwCP has constant-size proofs, which means that the size of the proof does not change based on the number of participants. This overcomes a significant drawback of earlier DVRF systems, which saw proof size increase with participant count. Furthermore, DVRFwCP produces more efficient verification than previous systems by eliminating the requirement for bilinear pairings throughout the verification process. These innovations contribute to a more secure and scalable solution for generating verifiable randomness in decentralized environments.
We compare our construction to well-established DVRF instantiations such as DDH-DVRF and GLOW-DVRF while also pointing out the major improvement in the estimated gas cost of these algorithms.

Proof-Carrying Data from Multi-folding Schemes

Proof-carrying data (PCD) is a cryptographic primitive enabling mutually distrustful parties to perform distributed computations on directed acyclic graphs with efficient and incremental verification. Key performance metrics include the prover cost at each step and the recursion overhead, which measures the additional cost beyond proving the original computation. Despite substantial advancements in constructing efficient PCD schemes, these metrics continue to be bottlenecks hindering their widespread application.
In this paper, we advance the research by constructing a new PCD scheme based on a new generalized construction of multi-folding schemes. Compared with the state-of-the-art PCD scheme by Bünz et al. (CRYPTO'21), our scheme reduces the prover cost at each step from $4r+6$ multi-scalar multiplications (MSMs) of size $O(|C|)$ to $1$ MSM of the same size, and the recursion overhead from $6$ MSMs of size $2r-1$, $1$ MSM of size $6r-3$ to $1$ MSM of size $2r-1$, where $r$ is the number of incoming edges at certain step and $|C|$ is the proving computation size. Additionally, our PCD scheme supports a more expressive constraint system for encoding computations, namely the customizable constraint system, which supports high-degree constraints, in contrast to the rank-1 constraint system adopted by existing PCD schemes that only supports quadratic constraints.
We implement our PCD scheme and report the concrete recursion overhead and practical efficiency for different values of $r$ and $|C|$. Compared with Bünz et al. (CRYPTO'21), our PCD scheme achieves a $2.5$ times lower recursion overhead when $r=2$ and $|C|=2^{20}$. Additionally, when $r=2$ and a proving computation size (excluding recursion overhead) of $2^{24}$, it takes $49$ seconds to generate a PCD proof at each step. Using a SNARK to compress the proof reduces the proof size from $1031$ MB to $13$ KB, with a tradeoff in the verifier time, which increases from $11$ seconds to $22$ seconds.

PARScoin: A Privacy-preserving, Auditable, and Regulation-friendly Stablecoin

Stablecoins are digital assets designed to maintain a consistent value relative to a reference point, serving as a vital component in Blockchain, and Decentralized Finance (DeFi) ecosystem. Typical implementations of stablecoins via smart contracts come with important downsides such as a questionable level of privacy, potentially high fees, and lack of scalability. We put forth a new design, PARScoin, for a Privacy-preserving, Auditable, and Regulation-friendly Stablecoin that mitigates these issues while enabling high performance both in terms of speed of settlement and for scaling to large numbers of users as our performance analysis demonstrates. Our construction is blockchain-agnostic and is analyzed in the Universal Composition (UC) framework, offering a secure and modular approach for its integration into the broader blockchain ecosystem.

A New PPML Paradigm for Quantized Models

Model quantization has become a common practice in machine learning (ML) to improve efficiency and reduce computational/communicational overhead. However, adopting quantization in privacy-preserving machine learning (PPML) remains challenging due to the complex internal structure of quantized operators, which leads to inefficient protocols under the existing PPML frameworks.
In this work, we propose a new PPML paradigm that is tailor-made for and can benefit from quantized models. Our main observation is that lookup tables can ignore the complex internal constructs of any functions which can be used to simplify the quantized operator evaluation. We view the model inference process as a sequence of quantized operators, and each operator is implemented by a lookup table. We then develop an efficient private lookup table evaluation protocol, and its online communication cost is only $\log n$, where $n$ is the size of the lookup table.
On a single CPU core, our protocol can evaluate $2^{15}$ tables with 8-bit input and 8-bit output per second.
The resulting PPML framework for quantized models offers extremely fast online performance.
The experimental results demonstrate that our quantization strategy achieves substantial speedups over SOTA PPML solutions, improving the online performance by $40\sim 60 \times$ w.r.t. convolutional neural network (CNN) models, such as AlexNet, VGG16, and ResNet18, and by $10\sim 25 \times$ w.r.t. large language models (LLMs), such as GPT-2, GPT-Neo, and Llama2.

Attribute-Based Signatures for Circuits with Optimal Parameter Size from Standard Assumptions

Attribute-based signatures (ABS) allow users to simultaneously sign messages and prove their possession of some attributes while hiding the attributes and revealing only the fact that they satisfy a public policy. In this paper, we propose a generic construction of ABS for circuits of unbounded depth and size with optimal parameter size, meaning that the lengths of public parameters, keys, and signatures are all constant. Our generic construction can be instantiated from various standard assumptions including LWE or DLIN. Only previous ABS construction with optimal parameter size necessitates succinct non-interactive argument of knowledge, which can be only constructed from non-standard assumptions. Our generic construction is based on RAM delegations, which intuitively allows us to compress the evaluation of a circuit when inputs are public. In high level, we find a way to compress the computation of the policy circuit on input a user attribute to achieve overall parameter size, while hiding the user policy at the same time.

LLRing: Logarithmic Linkable Ring Signatures with Transparent Setup

Linkable ring signatures are an important cryptographic primitive for anonymized applications, such as e-voting, e-cash and confidential transactions. To eliminate backdoor and overhead in a trusted setup, transparent setup in the discrete logarithm or pairing settings has received considerable attention in practice. Recent advances have improved the proof sizes and verification efficiency of linkable ring signatures with a transparent setup to achieve logarithmic bounds. Omniring (CCS '19) and RingCT 3.0 (FC '20) proposed linkable ring signatures in the discrete logarithm setting with logarithmic proof sizes with respect to the ring size, whereas DualDory (ESORICS '22) achieves logarithmic verifiability in the pairing setting. We make three novel contributions in this paper to improve the efficiency and soundness of logarithmic linkable ring signatures: (1) We identify an attack on DualDory that breaks its linkability. (2) To eliminate such an attack, we present a new linkable ring signature scheme in the pairing setting with logarithmic verifiability. (3) We also improve the verification efficiency of linkable ring signatures in the discrete logarithm setting, by a technique of reducing the number of group exponentiations for verification in Omniring by 50%. Furthermore, our technique is applicable to general inner-product relation proofs, which might be of independent interest. Finally, we empirically evaluate our schemes and compare them with the extant linkable ring signatures in concrete implementation.

Cryptiny: Compacting Cryptography for Space-Restricted Channels and its Use-case for IoT-E2EE

Uncategorized

Uncategorized

We present a novel cryptographic paradigm denoted ``cryptiny:'' Employing a single cryptographic value for several security goals, thus ``compacting'' the communication sent over a space-restricted (narrow) channel, while still proving security. Cryptiny is contrary to the classical cryptographic convention of using a separate cryptographic element for each security goal.
Demonstrating the importance of cryptiny, we employ it for securing a critical IoT configuration in which a broadcasting ``thing'' (called beacon) operates within stringent bandwidth constraints. In this setting, a compact BLE-broadcasting beacon lacking Internet connectivity efficiently directs brief (non fragmented) messages to its remotely pre-paired owner in real-time. Communication transpires through BLE-to-IP gateway devices denoted observers, (typically smartphones in the beacon's vicinity), and subsequently via a cloud app server. The gateway device as well, piggybacks on the transmission a secure and private message to the owner. This configuration is a generic setting for the current and future IoT real-time ecosystems, where billion of owners, beacons, and observers operate.
The configuration instances (analogous to TLS instances over the Internet) imposes high security and privacy demands. We prove that our cryptiny-based protocol for securing the above configuration achieves CCA-secrecy for the beacon's and the observer's messages with backward and forward security for the observer's message, as well simultaneously achieving mutual privacy for beacons and for observers. Achieving backward and forward security is important since beacon devices may be far from their owners for a long duration and may be passively tampered with. In addition, for the backward security proof we develop a new encryption scheme we call ``shifted-DHIES'' (``SDHIES'' for short), which generalizes DHIES. An interesting feature of SDHIES is that encryption is performed with a function of the public key rather than the public key itself.

Generalized Indifferentiable Sponge and its Application to Polygon Miden VM

Cryptographic hash functions are said to be the work-horses of modern cryptography. One of the strongest approaches to assess a cryptographic hash function's security is indifferentiability. Informally, indifferentiability measures to what degree the function resembles a random oracle when instantiated with an ideal underlying primitive. However, proving the indifferentiability security of hash functions has been challenging due to complex simulator designs and proof arguments. The Sponge construction is one of the prevalent hashing method used in various systems. The Sponge has been shown to be indifferentiable from a random oracle when initialized with a random permutation.
In this work, we first introduce $\mathsf{GSponge}$, a generalized form of the Sponge construction offering enhanced flexibility in input chaining, field sizes, and padding types. $\mathsf{GSponge}$ not only captures all existing sponge variants but also unveils new, efficient ones. The generic structure of $\mathsf{GSponge}$ facilitates the discovery of two micro-optimizations for already deployed sponges. Firstly, it allows a new padding rule based on zero-padding and domain-separated inputs, saving one full permutation call in certain cases without increasing the generation time of zero-knowledge proofs. Secondly, it allows to absorb up to $\mathsf{c}/2$ more elements (that can save another permutation call for certain message lengths) without compromising the indifferentiability security. These optimizations enhance hashing time for practical use cases such as Merkle-tree hashing and short message processing.
We then propose a new efficient instantiation of $\mathsf{GSponge}$ called $\mathsf{Sponge2}$ capturing these micro-optimizations and provide a formal indifferentiability proof to establish both $\mathsf{Sponge2}$ and $\mathsf{GSponge}$'s security. This proof, simpler than the original for Sponges, offers clarity and ease of understanding for real-world practitioners. Additionally, it is demonstrated that $\mathsf{GSponge}$ can be safely instantiated with permutations defined over large prime fields, a result not previously formally proven.

FIN: Practical Signature-Free Asynchronous Common Subset in Constant Time

Asynchronous common subset (ACS) is a powerful paradigm enabling applications such as Byzantine fault-tolerance (BFT) and multi-party computation (MPC). The most efficient ACS framework in the information-theoretic setting is due to Ben-Or, Kelmer, and Rabin (BKR, 1994). The BKR ACS protocol has been both theoretically and practically impactful. However, the BKR protocol has an $O(\log n)$ running time (where $n$ is the number of replicas) due to the usage of $n$ parallel asynchronous binary agreement (ABA) instances, impacting both performance and scalability. Indeed, for a network of 16$\sim$64 replicas, the parallel ABA phase occupies about 95%$\sim$97% of the total runtime in BKR. A long-standing open problem is whether we can build an ACS framework with $O(1)$ time while not increasing the message or communication complexity of the BKR protocol.
In this paper, we resolve the open problem, presenting the first constant-time ACS protocol with $O(n^3)$ messages in the information-theoretic and signature-free settings. Moreover, as a key ingredient of our new ACS framework and an interesting primitive in its own right, we provide the first information-theoretic multivalued validated Byzantine agreement (MVBA) protocol with $O(1)$ time and $O(n^3)$ messages. Both results can improve---asymptotically and concretely---various applications using ACS and MVBA in the information-theoretic, quantum-safe, or signature-free settings. As an example, we implement FIN, a BFT protocol instantiated using our framework. Via a 121-server deployment on Amazon EC2, we show FIN is significantly more efficient than PACE (CCS 2022), the state-of-the-art asynchronous BFT protocol of the same type. In particular, FIN reduces the overhead of the ABA phase to as low as 1.23% of the total runtime, and FIN achieves up to 3.41x the throughput of PACE. We also show that FIN outperforms other BFT protocols with the standard liveness property such as Dumbo and Speeding Dumbo.

A Note on Efficient Computation of the Multilinear Extension

The multilinear extension of an $m$-variate function $f : \{0,1\}^m \to \mathbb{F}$, relative to a finite field $\mathbb{F}$, is the unique multilinear polynomial $\hat{f} : \mathbb{F}^m \to \mathbb{F}$ that agrees with $f$ on inputs in $\{0,1\}^m$.
In this note we show how, given oracle access to $f : \{0,1\}^m \to \mathbb{F}$ and a point $z \in \mathbb{F}^m$, to compute $\hat{f}(z)$ using exactly $2^{m+1}$ multiplications, $2^m$ additions and $O(m)$ additional operations. The amount of space used corresponds to $O(m)$ field elements.

QuickPool: Privacy-Preserving Ride-Sharing Service

Online ride-sharing services (RSS) have become very popular owing to increased awareness of environmental concerns and as a response to increased traffic congestion. To request a ride, users submit their locations and route information for ride matching to a service provider (SP), leading to possible privacy concerns caused by leakage of users' location data. We propose QuickPool, an efficient SP-aided RSS solution that can obliviously match multiple riders and drivers simultaneously, without involving any other auxiliary server. End-users, namely, riders and drivers share their route information with SP as encryptions of the ordered set of points-of-interest (PoI) of their route from their start to end locations. SP performs a zone based oblivious matching of drivers and riders, based on partial route overlap as well as proximity of start and end points. QuickPool is in the semi-honest setting, and makes use of secure multi-party computation. We provide security proof of our protocol, perform extensive testing of our implementation and show that our protocol simultaneously matches multiple drivers and riders very efficiently. We compare the performance of QuickPool with state-of-the-art works and observe a run time improvement of 1.6 - 2$\times$, and communication improvement of at least 8$\times$.

The Insecurity of SHA2 under the Differential Fault Characteristic of Boolean Functions

SHA2 has been widely adopted across various traditional public-key cryptosystems, post-quantum cryptography, personal identification, and network communication protocols, etc. Hence, ensuring the robust security of SHA2 is of critical importance. There have been several differential fault attacks based on random word faults targeting SHA1 and SHACAL-2. However, extending such random word-based fault attacks to SHA2 proves significantly more difficult due to the heightened complexity of the boolean functions in SHA2.
In this paper, assuming random word faults, we find some distinctive differential properties within the boolean functions in SHA2. Leveraging these findings, we propose a new differential fault attack methodology that can be effectively utilized to recover the final message block and its corresponding initial vector in SHA2, forge HMAC-SHA2 messages, extract the key of SHACAL-2, and extend our analysis to similar algorithm like SM3. We validate the effectiveness of these attacks through rigorous simulations and theoretical deductions, revealing that they indeed pose substantial threats to the security of SHA2. In our simulation-based experiments, our approach necessitates guessing $T$ bits within a register, with $T$ being no more than $5$ at most, and having a approximate $95\%$ (for SHA512) probability of guessing just $1$ bit. Moreover, upon implementing a consecutive series of 15 fault injections, the success probability for recovering one register (excluding the guessed bits) approaches $100\%$. Ultimately, approximately 928 faulty outputs based on random word faults are required to carry out the attack successfully.

Curl: Private LLMs through Wavelet-Encoded Look-Up Tables

Recent advancements in transformers have revolutionized machine learning, forming the core of Large language models (LLMs). However, integrating these systems into everyday applications raises privacy concerns as client queries are exposed to model owners. Secure multiparty computation (MPC) allows parties to evaluate machine learning applications while keeping sensitive user inputs and proprietary models private. Due to inherent MPC costs, recent works introduce model-specific optimizations that hinder widespread adoption by machine learning researchers. CrypTen (NeurIPS'21) aimed to solve this problem by exposing MPC primitives via common machine learning abstractions such as tensors and modular neural networks. Unfortunately, CrypTen and many other MPC frameworks rely on polynomial approximations of the non-linear functions, resulting in high errors and communication complexity.
This paper introduces Curl, an easy-to-use MPC framework that evaluates non-linear functions as lookup tables, resulting in better approximations and significant round and communication reduction. Curl exposes a similar programming model as CrypTen and is highly parallelizable through tensors. At its core, Curl relies on discrete wavelet transformations to reduce the lookup table size without sacrificing accuracy, which results in up to $19\times$ round and communication reduction compared to CrypTen for non-linear functions such as logarithms and reciprocals. We evaluate Curl on a diverse set of LLMs, including BERT, GPT-2, and GPT Neo, and compare against state-of-the-art related works such as Iron (NeurIPS'22) and Bolt (S&P'24) achieving at least $1.9\times$ less communication and latency.
Finally, we resolve a long-standing debate regarding the security of widely used probabilistic truncation protocols by proving their security in the stand-alone model. This is of independent interest as many related works rely on this truncation style.

HRA-Secure Homomorphic Lattice-Based Proxy Re-Encryption with Tight Security

We construct an efficient proxy re-encryption (PRE) scheme secure against honest re-encryption attacks (HRA-secure) with precise concrete security estimates. To get these precise concrete security estimates, we introduce the tight, fine-grained noise-flooding techniques of Li et al. (CRYPTO'22) to RLWE-based (homomorphic) PRE schemes, as well as a mixed statistical-computational security to HRA security analysis. Our solution also supports homomorphic operations on the ciphertexts. Such homomorphism allows for advanced applications, e.g., encrypted computation of network statistics across networks and unlimited hops, in the case of full homomorphism, i.e., bootstrapping.
We implement our PRE scheme in the OpenFHE software library and apply it to a problem of secure multi-hop data distribution in the context of 5G virtual network slices. We also experimentally evaluate the performance of our scheme, demonstrating that the implementation is practical.
In addition, we compare our PRE method with other lattice-based PRE schemes and approaches to achieve HRA security. These achieve HRA security, but not in a tight, practical scheme such as our work. Further, we present an attack on the PRE scheme proposed in Davidson et al.'s (ACISP'19), which was claimed to achieve HRA security without noise flooding.

DiLizium 2.0: Revisiting Two-Party Crystals-Dilithium

In previous years there has been an increased interest in designing threshold signature schemes. Most of the recent works focus on constructing threshold versions of ECDSA or Schnorr signature schemes due to their appealing usage in blockchain technologies. Additionally, a lot of research is being done on cryptographic schemes that are resistant to quantum computer attacks.
In this work, we propose a new version of the two-party Dilithium signature scheme. The security of our scheme is based on the hardness of Module-LWE and Module-SIS problems. In our construction, we follow a similar logic as Damgård et al. (PKC 2021) and use an additively homomorphic commitment scheme. However, compared to them, our protocol uses signature compression techniques from the original Dilithium signature scheme which makes it closer to the version submitted to the NIST PQC competition. We focus on two-party signature schemes in the context of user authentication.

Is ML-Based Cryptanalysis Inherently Limited? Simulating Cryptographic Adversaries via Gradient-Based Methods

Given the recent progress in machine learning (ML), the cryptography community has started exploring the applicability of ML methods to the design of new cryptanalytic approaches. While current empirical results show promise, the extent to which such methods may outperform classical cryptanalytic approaches is still somewhat unclear.
In this work, we initiate exploration of the theory of ML-based cryptanalytic techniques, in particular providing new results towards understanding whether they are fundamentally limited compared to traditional approaches. Whereas most classic cryptanalysis crucially relies on directly processing individual samples (e.g., plaintext-ciphertext pairs), modern ML methods thus far only interact with samples via gradient-based computations that average a loss function over all samples. It is, therefore, conceivable that such gradient-based methods are inherently weaker than classical approaches.
We introduce a unifying framework for capturing both ``sample-based'' adversaries that are provided with direct access to individual samples and ``gradient-based'' ones that are restricted to issuing gradient-based queries that are averaged over all given samples via a loss function. Within our framework, we establish a general feasibility result showing that any sample-based adversary can be simulated by a seemingly-weaker gradient-based one. Moreover, the simulation exhibits a nearly optimal overhead in terms of the gradient-based simulator's running time. Finally, we extend and refine our simulation technique to construct a gradient-based simulator that is fully parallelizable (crucial for avoiding an undesirable overhead for parallelizable cryptanalytic tasks), which is then used to construct a gradient-based simulator that executes the particular and highly useful gradient-descent method.
Taken together, although the extent to which ML methods may outperform classical cryptanalytic approaches is still somewhat unclear, our results indicate that such gradient-based methods are not inherently limited by their seemingly restricted access to the provided samples.

Revisiting PACD-based Attacks on RSA-CRT

In this work, we use some recent developments in lattice-based cryptanalytic tools to revisit a fault attack on RSA-CRT signatures based on the Partial Approximate Common Divisor (PACD) problem. By reducing the PACD to a Hidden Number Problem (HNP) instance, we decrease the number of required faulted bits from 32 to 7 in the case of a 1024-bit RSA. We successfully apply the attack to RSA instances up to 8192-bit and present an enhanced analysis of the error-tolerance in the Bounded Distance Decoding (BDD) with predicate approach. Finally, evaluating the impact of standard side-channel and fault countermeasures, we show that merely verifying the signature before output is not an adequate protection against this attack. The reduction from PACD to HNP might be of independent interest.

Oryx: Private detection of cycles in federated graphs

This paper proposes Oryx, a system for efficiently detecting cycles in federated graphs where parts of the graph are held by different parties and are private. Cycle detection is an important building block in designing fraud detection algorithms that operate on confidential transaction data held by different financial institutions. Oryx allows detecting cycles of various length while keeping the topology of the graphs secret, and it does so efficiently; Oryx achieves quasilinear computational complexity and scales well with more machines thanks to a parallel design. Our implementation of Oryx running on a single 32-core AWS machine (for each party) can detect cycles of up to length 6 in under 5 hours in a financial transaction graph that consists of tens of millions of nodes and edges. While the costs are high, adding more machines further reduces the completion time. Furthermore, Oryx is, to our knowledge, the first and only system that can handle this task.

DeCAF: Decentralizable Continuous Group Key Agreement with Fast Healing

Continuous group key agreement (CGKA) allows a group of users to maintain a continuously updated shared key in an asynchronous setting where parties only come online sporadically and their messages are relayed by an untrusted server. CGKA captures the basic primitive underlying group messaging schemes.
Current solutions including TreeKEM ("Messaging Layer Security'' (MLS) IETF RFC 9420) cannot handle concurrent requests while retaining low communication complexity. The exception being CoCoA, which is concurrent while having extremely low communication complexity (in groups of size $n$ and for $m$ concurrent updates the communication per user is $\log(n)$, i.e., independent of $m$). The main downside of CoCoA is that in groups of size $n$, users might have to do up to $\log(n)$ update requests to the server to ensure their (potentially corrupted) key material has been refreshed.
In this work we present a "fast healing'' concurrent CGKA protocol, named DeCAF, where users will heal after at most $\log(t)$ requests, with $t$ being the number of corrupted users. While also suitable for the standard central-server setting, our protocol is particularly interesting for realizing decentralized group messaging, where protocol messages (add, remove, update) are being posted on some append-only data structure rather than sent to a server. In this setting, concurrency is crucial once the rate of requests exceeds, say, the rate at which new blocks are added to a blockchain.
In the central-server setting, CoCoA (the only alternative with concurrency, sub-linear communication and basic post-compromise security) enjoys much lower download communication. However, in the decentralized setting - where there is no server which can craft specific messages for different users to reduce their download communication - our protocol significantly outperforms CoCoA. DeCAF heals in fewer rounds ($\log(t)$ vs. $\log(n)$) while incurring a similar per round per user communication cost.