Papers updated in last 31 days (297 results)
Exploring the Six Worlds of Gröbner Basis Cryptanalysis: Application to Anemoi
Gröbner basis cryptanalysis of hash functions and ciphers, and their underlying permutations, has seen renewed interest recently. Anemoi (Crypto'23) is a permutation-based hash function that is efficient for a variety of arithmetizations used in zero-knowledge proofs. In this paper, exploring both theoretical bounds as well as experimental validation, we present new complexity estimates for Gröbner basis attacks on the Anemoi permutation over prime fields.
We cast our findings in what we call the six worlds of Gröbner basis cryptanalysis. As an example, keeping the same security arguments of the design, we conclude that at least 41 instead of 37 rounds would need to be used for 256-bit security, whereby our suggestion does not yet include a security margin.
Another Lattice Attack Against an RSA-like Cryptosystem
Let $N=pq$ be the product of two balanced prime numbers $p$ and $q$. In 2015, Roman'kov introduced an interesting RSA-like cryptosystem that, unlike the classical RSA key equation $ed - k (p-1)(q-1) = 1$, uses the key equation $ed - k r = 1$, where $r | p-1$ and is a large prime number. In this paper, we study if small private key attacks based on lattices can be applied to Roman'kov's cryptosystem. More precisely, we argue that such attacks do not appear to be applicable to this scheme without substantial adaptations.
Revisiting Leakage-Resilient MACs and Succinctly-Committing AEAD: More Applications of Pseudo-Random Injections
Pseudo-Random Injections (PRIs) have been used in several applications in symmetric-key cryptography, such as in the idealization of Authenticated Encryption with Associated Data (AEAD) schemes, building robust AEAD, and, recently, in converting a committing AEAD scheme into a succinctly committing AEAD scheme. In Crypto 2024, Bellare and Hoang showed that if an AEAD scheme is already committing, it can be transformed into a succinctly committing scheme by encrypting part of the plaintext using a PRI. In this paper, we revisit the applications of PRIs in building Message Authentication Codes (MACs) and AEAD schemes. First, we look at some of the properties and definitions of PRIs, such as collision resistance and unforgeability when used as a MAC with small plaintext space, under different leakage models. Next, we show how they can be combined with collision-resistant hash functions to build a MAC for long plaintexts, offering flexible security depending on how the PRI and equality check are implemented. If both the PRI and equality check are leak-free, the MAC provides almost optimal security, but the security only degrades a little if the equality check is only leakage-resilient (rather than leak-free). If the equality check has unbounded leakage, the security drops to a baseline security, rather than being completely insecure. Next, we show how to use PRIs to build a succinctly committing online AEAD scheme dubbed as scoAEfrom scratch that achieves succinct CMT-4 security, privacy, and Ciphertext Integrity with Misuse and Leakage (CIML) security. Last but not least, we show how to build a succinct nonce Misuse-Resistant (MRAE) AEAD scheme, dubbed as scMRAE. The construction combines the SIV paradigm with PRI-based encryption (e.g. the Encode-then-Encipher (EtE) framework).
Light Clients for Lazy Blockchains
Lazy blockchains decouple consensus from transaction verification and execution to increase throughput. Although they can contain invalid transactions (e.g., double spends) as a result, these can easily be filtered out by full nodes that check if there have been previous conflicting transactions. However, creating light (SPV) clients that do not see the whole transaction history becomes a challenge: A record of a transaction on the chain does not necessarily entail transaction confirmation. In this paper, we devise a protocol that enables the creation of efficient light clients for lazy blockchains. The number of interaction rounds and the communication complexity of our protocol are logarithmic in the blockchain execution time. Our construction is based on a bisection game that traverses the Merkle tree containing the ledger of all - valid or invalid - transactions. We prove that our proof system is succinct, complete and sound, and empirically demonstrate the feasibility of our scheme.
Discrete Gaussians Modulo Sub-Lattices: New Leftover Hash Lemmas for Discrete Gaussians
The Leftover Hash Lemma (LHL) is a powerful tool for extracting randomness from an entropic distribution, with numerous applications in cryptography. LHLs for discrete Gaussians have been explored in both integer settings by Gentry et al. (GPV, STOC'08) and algebraic ring settings by Lyubashevsky et al. (LPR, Eurocrypt'13). However, the existing LHLs for discrete Gaussians have two main limitations: they require the Gaussian parameter to be larger than certain smoothing parameters, and they cannot handle cases where fixed and arbitrary information is leaked.
In this work, we present new LHLs for discrete Gaussians in both integer and ring settings. Our results show that the Gaussian parameter can be improved by a factor of $\omega(\sqrt{\log\lambda})$ and $O(\sqrt{N})$ compared to the regularity lemmas of GPV and LPR, respectively, under similar parameter choices such as the dimension and ring. Furthermore, our new LHLs can be applied to leaked discrete Gaussians, and the result can be used to establish asymptotic hardness of the extended MLWE assumptions, addressing an open question in recent works by Lyubashevsky et al. (LNP, Crypto'22). Our central techniques involve new fine-grained analyses of the min-entropy in discrete Gaussians modulo sublattices and should be of interest.
Finding Practical Parameters for Isogeny-based Cryptography
Isogeny-based schemes often come with special requirements on the field of definition of the involved elliptic curves. For instance, the efficiency of SQIsign, a promising candidate in the NIST signature standardisation process, requires a large power of two and a large smooth integer $T$ to divide $p^2-1$ for its prime parameter $p$.
We present two new methods that combine previous techniques for finding suitable primes: sieve-and-boost and XGCD-and-boost. We use these methods to find primes for the NIST submission of SQIsign. Furthermore, we show that our methods are flexible and can be adapted to find suitable parameters for other isogeny-based schemes such as AprèsSQI or POKE. For all three schemes, the parameters we present offer the best performance among all parameters proposed in the literature.
Another Walk for Monchi
Monchi is a new protocol aimed at privacy-preserving biometric identification. It begins with scores computation in the encrypted domain thanks to homomorphic encryption and ends with comparisons of these scores to a given threshold with function secret sharing. We here study the integration in that context of scores computation techniques recently introduced by Bassit et al. that eliminate homomorphic multiplications by replacing them by lookup tables. First, we extend this lookup tables biometric recognition solution by adding the use of function secret sharing for the final comparison of scores. Then, we introduce a two-party computation of the scores with lookup tables which fits nicely together with the function secret sharing scores comparison. Our solutions accommodate well with the flight boarding use case introduced by Monchi.
A Tool for Fast and Secure LWE Parameter Selection: the FHE case
The field of fully homomorphic encryption (FHE) has seen many theoretical and computational advances in recent years, bringing the technology closer to practicality than ever before. For this reason, practitioners in related fields, such as machine learning, are increasingly interested in using FHE to provide privacy to their applications.
Despite this progress, selecting secure and efficient parameters for FHE remains a complex and challenging task due to the intricate interdependencies between parameters. In this work, we address this issue by providing a rigorous theoretical foundation for parameter selection for any LWE-based schemes, with a specific focus on FHE. Our approach starts with an in-depth analysis of lattice attacks on the LWE problem, deriving precise expressions for the most effective ones. Building on this, we introduce closed-form formulas that establish the relationships among the LWE parameters.
In addition, we introduce a numerical method to enable the accurate selection of any configurable parameter to meet a desired security level.
Finally, we use our results to build a practical and efficient tool for researchers and practitioners deploying FHE in real-world applications, ensuring that our approach is both rigorous and accessible.
Shardora: Towards Scaling Blockchain Sharding via Unleashing Parallelism
Sharding emerges as a promising solution to enhance blockchain scalability. However, it faces two critical limitations during shard reconfiguration: (1) the TPS-Degradation issue, arising from ledger synchronization conflicts during transaction processing, and (2) the Zero-TPS issue, caused by disruptions in transaction processing due to key negotiation. To this end, we propose Shardora, a blockchain sharding system for scaling blockchain by unleashing parallelism. In Shardora, we implement two essential mechanisms: (1) A parallelized dual committee framework with a reputation mechanism to mitigate the TPS-Degradation issue while ensuring system security. (2) A parallelized key pre-negotiation mechanism with a secret-reuse strategy to avoid the Zero-TPS issue while maintaining a continuously high TPS. We prove that Shardora offers theory-guaranteed security. We implement a prototype of Shardora and deploy it on Alibaba Cloud. Experimental results demonstrate that Shardora addresses the limitations by significantly reducing the overhead of both ledger synchronization and key negotiation, which outperforms state-of-the-art sharding schemes by at least 90%. In addition, Shardora shows its superior performance in terms of throughput and latency, achieving a peak throughput of 8300 TPS on a single shard with 600 nodes under LAN conditions. The code of Shardora is publicly available on GitHub.
PerfOMR: Oblivious Message Retrieval with Reduced Communication and Computation
Anonymous message delivery, as in privacy-preserving blockchain and private messaging applications, needs to protect recipient metadata: eavesdroppers should not be able to link messages to their recipients. This raises the question: how can untrusted servers assist in delivering the pertinent messages to each recipient, without learning which messages are addressed to whom?
Recent work constructed Oblivious Message Retrieval (OMR) protocols that outsource the message detection and retrieval in a privacy-preserving way, using homomorphic encryption. Their construction exhibits significant costs in computation per message scanned (${\sim}0.1$ second), as well as in the size of the associated messages (${\sim}1$kB overhead) and public keys (${\sim}132$kB).
This work constructs more efficient OMR schemes, by replacing the LWE-based clue encryption of prior works with a Ring-LWE variant, and utilizing the resulting flexibility to improve several components of the scheme. We thus devise, analyze, and benchmark two protocols:
The first protocol focuses on improving the detector runtime, using a new retrieval circuit that can be homomorphically evaluated $13.8$x faster than the prior work.
The second protocol focuses on reducing the communication costs, by designing a different homomorphic decryption circuit that allows the parameter of the Ring-LWE encryption to be set such that the public key size is about $235$x smaller than the prior work, and the message size is roughly $1.6$x smaller. The runtime of this second construction is ${\sim}40.0$ms per message, still more than $2.5$x faster than prior works.
A non-comparison oblivious sort and its application to private k-NN
This paper introduces a novel adaptation of counting sort that enables sorting of encrypted data using Fully Homomorphic Encryption (FHE). Our approach represents the first known sorting algorithm for encrypted data that does not rely on comparisons. The implementation leverages some basic operations on TFHE's Look-Up-Tables (LUT). We have integrated these operations into RevoLUT, a comprehensive open-source library built upon tfhe-rs, which can be of independent interest for oblivious algorithms. We demonstrate the effectiveness of our Blind Counting Sort algorithm by developing a top-$k$ selection algorithm and applying it to privacy-preserving $k$-Nearest Neighbors classification. This proves to be approximately 5x faster than current state-of-the-art methods.
A notion on S-boxes for a partial resistance to some integral attacks
In two recent papers, we introduced and studied the notion of $k$th-order sum-freedom of a vectorial function $F:\mathbb F_2^n\to \mathbb F_2^m$. This notion generalizes that of almost perfect nonlinearity (which corresponds to $k=2$) and has some relation with the resistance to integral attacks of those block ciphers using $F$ as a substitution box (S-box), by preventing the propagation of the division property of $k$-dimensional affine spaces. In the present paper, we show that this notion, which is rarely satisfied by vectorial functions, can be weakened while retaining the property that the S-boxes do not propagate the division property of $k$-dimensional affine spaces. This leads us to the property that we name $k$th-order $t$-degree-sum-freedom, whose strength decreases when $t$ increases, and which coincides with $k$th-order sum-freedom when $t=1$. The condition for $k$th-order $t$-degree-sum-freedom is that, for every $k$-dimensional affine space $A$, there exists a non-negative integer $j$ of 2-weight at most $t$ such that $\sum_{x\in A}(F(x))^j\neq 0$. We show, for a general $k$th-order $t$-degree-sum-free function $F$, that $t$ can always be taken smaller than or equal to $\min(k,m)$ under some reasonable condition on $F$, and that it is larger than or equal to $\frac k{\deg(F)}$, where $\deg(F)$ is the algebraic degree of $F$. We study examples for $k=2$ (case in which $t=1$ corresponds to APNness) showing that finding $j$ of 2-weight 2 can be challenging, and we begin the study of power functions, and in particular, of the multiplicative inverse function (used as S-box in the AES), for which we extend to $k$th-order $t$-degree-sum-freedom the result that it is $k$th-order sum-free if and only if it is $(n-k)$th-order sum-free. We begin the study of the cases of $k\in \{2,3,n-3,n-2,n-1,n\}$.
Preservation of Speculative Constant-Time by Compilation
Compilers often weaken or even discard software-based countermeasures commonly used to protect programs against side-channel attacks; worse, they may also introduce vulnerabilities that attackers can exploit. The solution to this problem is to develop compilers that preserve such countermeasures. Prior work establishes that (a mildly modified version of) the CompCert and Jasmin formally verified compilers preserve constant-time, an information flow policy that ensures that programs are protected against timing side-channel attacks. However, nothing is known about preservation of speculative constant-time, a strengthening of the constant-time policy that ensures that programs are protected against Spectre-v1 attacks. We first show that preservation of speculative constant-time fails in practice by providing examples of secure programs whose compilation is not speculative constant-time using GCC (GCC -O0 and GCC -O1) and Jasmin. Then, we define a proof-of-concept compiler that distills some of the critical passes of the Jasmin compiler and use the Coq proof assistant to prove that it preserves speculative constant-time. Finally, we patch the Jasmin speculative constant-time type checker and demonstrate that all cryptographic implementations written in Jasmin can be fixed with minimal impact.
Extended Diffie-Hellman Encryption for Secure and Efficient Real-Time Beacon Notifications
Every computing paradigm involving communication requires new security protocols employing cryptography. For example, the Internet gave rise to TLS/SSL, and Mobile Computing gave rise to End to End Encryption protocols. In this paper, we address an emerging IoT paradigm involving beacons attached to things and security protocols associated with this new configuration.
Specifically, we address the ``beacon notification problem,'' a critical IoT paradigm aims at providing secure and efficient real-time notifications from beacons to their owners. Since the beacon notification problem has not yet been formally defined, we begin by inspecting natural requirements based on the operational setting and establishing correctness, security, and privacy definitions through the use of cryptographic games.
To resolve this problem, we propose a novel cryptographic tool we call XDHIES, which is a considerable extension of available Diffie-Hellman encryption schemes. We then show a new notification protocol built upon XDHIES and we prove that this cryptographic protocol is secure and private and successfully meets all the above problem's requirements.
WHIR: Reed–Solomon Proximity Testing with Super-Fast Verification
We introduce WHIR, a new IOP of proximity that offers small query complexity and exceptionally fast verification time. The WHIR verifier typically runs in a few hundred microseconds, whereas other verifiers in the literature require several milliseconds (if not much more). This significantly improves the state of the art in verifier time for hash-based SNARGs (and beyond).
Crucially, WHIR is an IOP of proximity for constrained Reed–Solomon codes, which can express a rich class of queries to multilinear polynomials and to univariate polynomials. In particular, WHIR serves as a direct replacement for protocols like FRI, STIR, BaseFold, and others. Leveraging the rich queries supported by WHIR and a new compiler for multilinear polynomial IOPs, we obtain a highly efficient SNARG for generalized R1CS.
As a comparison point, our techniques also yield state-of-the-art constructions of hash-based (non-interactive) polynomial commitment schemes for both univariate and multivariate polynomials (since sumcheck queries naturally express polynomial evaluations). For example, if we use WHIR to construct a polynomial commitment scheme for degree 222, with 100 bits of security, then the time to commit and open is 1.2 seconds, the sender communicates 63 KiB to the receiver, and the opening verification time is 360 microseconds.
High Speed High Assurance implementations of Mutivariate Quadratic based Signatures
In this poster, we present a Jasmin implementation of Mayo2, a multivariate quadratic(MQ) based signature scheme. Mayo overcomes the disadvantage of the Unbalanced oil and vinegar(UOV) scheme by whipping the UOV map to produce public keys of sizes comparable to ML-DSA. Our Jasmin implementation of Mayo2 takes 930 μs for key-gen, 3206 μs for sign, 480 μs for verify based on the average of 1,00,000 runs of the implementation on a 2.25GHz x86 64 processor with 256 GB RAM. To this end, we have a multivariate quadratic based signature implementation that is amenable for verification of constant-time, correctness, proof of equivalence properties using Easycrypt. Subsequently, the results of this endeavor can be extended for other MQ based schemes including UOV.
A Comprehensive Survey on Hardware-Software co-Protection against Invasive, Non-Invasive and Interactive Security Threats
In the face of escalating security threats in modern
computing systems, there is an urgent need for comprehensive
defense mechanisms that can effectively mitigate invasive, noninvasive and interactive security vulnerabilities in hardware
and software domains. Individually, hardware and software
weaknesses and probable remedies have been practiced but
protecting a combined system has not yet been discussed in
detail. This survey paper provides a comprehensive overview of
the emerging field of Hardware-Software co-Protection against
Invasive and Non-Invasive Security Threats. We systematically
review state-of-the-art research and developments in hardware
and software security techniques, focusing on their integration to
create synergistic defense mechanisms. The survey covers a wide
range of security threats, including physical attacks, side-channel
attacks, and malware exploits, and explores the diverse strategies
employed to counter them. Our survey meticulously examines the
landscape of security vulnerabilities, encompassing both physical
and software-based attack vectors, and explores the intricate
interplay between hardware and software defenses in mitigating
these threats. Furthermore, we discuss the challenges and opportunities associated with Hardware-Software co-Protection and
identify future research directions to advance the field. Through
this survey, we aim to provide researchers, practitioners, and
policymakers with valuable insights into the latest advancements
and best practices for defending against complex security threats
in modern computing environments.
Sailfish: Towards Improving the Latency of DAG-based BFT
Directed Acyclic Graph (DAG) based BFT protocols balance consensus efforts across different parties and maintain high throughput even when some designated parties fail. However, existing DAG-based BFT protocols exhibit long latency to commit decisions, primarily because they have a \emph{leader} every 2 or more ``rounds''. Recent works, such as Shoal (FC'23) and Mysticeti, have deemed supporting a leader vertex in each round particularly difficult, if not impossible. Consequently, even under honest leaders, these protocols require high latency (or communication complexity) to commit the proposal submitted by the leader (leader vertex) and additional latency to commit other proposals (non-leader vertices).
In this work, we present Sailfish, the first DAG-based BFT that supports a leader vertex in each round. Under honest leaders, Sailfish maintains a commit latency of one reliable broadcast (RBC) round plus $1\delta$ to commit the leader vertex (where $\delta$ is the actual transmission latency of a message) and only an additional RBC round to commit non-leader vertices. We also extend Sailfish to Multi-leader Sailfish, which facilitates multiple leaders within a single round and commits all leader vertices in a round with a latency of one RBC round plus $1\delta$. Our experimental evaluation demonstrates that our protocols introduce significantly lower latency overhead compared to existing DAG-based protocols, with similar throughput.
Inflation-Tracking Proof-of-Work Crypto-Currencies
We show that Bitcoin and other existing egalitarian crypto-currencies are unstable as store-of-value as they fail to track inflation of local currencies closely, and the price dynamic is purely driven by speculation. In the case of Bitcoin, we show that instead of price being based on cost of mining Bitcoin, it is the cost of mining that rapidly converges to the current price of Bitcoin. Based on rational expectations equilibrium, we argue that if the coins awarded during mining are increased in proportion to increase in difficulty of the underlying cryptographic puzzle, then the price of the coin is likely to track inflation of local currencies closely over medium to long term. However, since Moore's law as well as targeted hardware design can lead to computational cost deflation, we suggest a hyper-geometric tapering, instead of a geometric tapering, of the mining award over time. This also handles bootstrapping interest in the crypto-currency.
Shifting our knowledge of MQ-Sign security
Unbalanced Oil and Vinegar (UOV) is one of the oldest, simplest, and most studied ad-hoc multivariate signature schemes. UOV signature schemes are attractive because they have very small signatures and fast verification. On the downside, they have large public and secret keys. As a result, variations of the traditional UOV scheme are usually developed with the goal to reduce the key sizes. Seven variants of UOV were submitted to the additional call for digital signatures by NIST, prior to which, a variant named MQ-Sign was submitted to the (South) Korean post-quantum cryptography competition (KpqC). MQ-Sign is currently competing in the second round of KpqC with two variants. One of the variants corresponds to the classic description of UOV with certain implementation and parameter choices. In the other variant, called MQ-Sign-LR, a part of the central map is constructed from row shifts of a single matrix. This design makes for smaller secret keys, and in the case where the equivalent keys optimization is used, it also leads to smaller public keys. However, we show in this work that the polynomial systems arising from an algebraic attack have a specific structure that can be exploited. Specifically, we are able to find preimages for $d$-periodic targets under the public map with a probability of $63\%$ for all security levels. The complexity of finding these preimages, as well as the fraction of $d$-periodic target increases with $d$ and hence provides a trade-off. We show that for all security levels one can choose $d=\frac{v}{2}$, for $v$ the number of vinegar variables, and reduce the security claim. Our experiments show practical running times for lower $d$ ranging from 0.06 seconds to 32 hours.
Design and Implementation of a Fast, Platform-Adaptive, AIS-20/31 Compliant PLL-Based True Random Number Generator on a Zynq 7020 SoC FPGA
Phase-locked loops (PLLs) integrated within field-programmable gate arrays (FPGAs) or System-on-Chip FPGAs (SoCs) represent a promising approach for generating random numbers. Their widespread deployment, isolated functionality within these devices, and robust entropy, as demonstrated in prior studies, position PLL-based true random number generators (PLL-TRNGs) as highly viable solutions for this purpose. This study explicitly examines PLL-TRNG implementations using the ZC702 Rev1.1 Evaluation Board featuring the Zynq 7020 SoC from Xilinx, utilizing a configuration involving three such boards for experimental validation. Parameters governing the PLL-TRNG are optimized using a backtracking algorithm. Additionally, a novel methodology is proposed to enhance the rate of random data bit generation while preserving entropy characteristics. Performance metrics are rigorously evaluated against the criteria set by the German Federal Office for Information Security (BSI) AIS-20/31 Tests, accompanied by detailed descriptions of the implementation process. Furthermore, the suitability of our PLL-TRNG designs, attributed to their low resource utilization, is demonstrated.
Succinctly Verifiable Computation over Additively-Homomorphically Encrypted Data with Applications to Privacy-Preserving Blueprints
With additively homomorphic encryption (AHE), one can compute, from input ciphertexts $\mathsf{Enc}(x_1),\ldots,\mathsf{Enc}(x_n)$, and additional inputs $y_1,\ldots,y_k$, a ciphertext $c_\textit{f}=\mathsf{Enc}(f(x_1,\ldots,x_n,y_1,\ldots, y_k))$ for any polynomial $f$ in which each monomial has total degree at most $1$ in the $x$-variables (but can be arbitrary in the $y$-variables). For AHE that satisfies a set of natural requirements, we give a non-interactive zero-knowledge proof system (in the random-oracle model) for showing that a ciphertext $c_\textit{f}$ is the result of homomorphically evaluating $f$ on ciphertexts $c_1,\ldots,c_n$ and private inputs $y_1,\ldots,y_k$ that correspond to commitments $C_1,\ldots,C_k$. Our proofs are $\textit{succinct}$, i.e., their size is independent of the number of ciphertexts $n$, and is instead $O(k\log d)$ where $k$ is the number of private inputs, and $d$ is the maximum degree of any variable in $f$.
We give two ways of instantiating this framework: with ElGamal-based encryption (under the DDH assumption) and with a variant of the Camenisch-Shoup cryptosystem (under the DCR assumption). Both yield proof systems where computing and verifying the proof takes a comparable amount of time to homomorphically evaluating $f$.
Next, we show that our framework yields a dramatically improved privacy-preserving blueprint (PPB) system. Introduced by Kohlweiss, Lysyanskaya, and Nguyen (Eurocrypt'23), an $f$-PPB system allows an auditor with secret input $x$ to create a public encoding $\sf pk$ of the function $f(x,\cdot)$ that reveals nothing about $x$.
Yet, it allows a user to compute an encoding, or escrow $Z$, of the value $f(x,y)$ on input the user's private data $y$ corresponding to a commitment $C_y$; $Z$ will verifiably correspond to the commitment $C_y$. The auditor will be able to recover $f(x,y)$ from $Z$, but will learn no other information about $y$. For example, if $f$ is the watchlist function where $f(x,y)$ outputs $y$ only in the event that $y$ is on the list $x$, then an $f$-PPB allows the auditor to trace watchlisted users in an otherwise anonymous system.
Using our succinct zero-knowledge proof system for additively homomorphic computation we achieve the following results: (1) We provide efficient schemes for a bigger class of functions $f$; for example, we show how to realize $f$ that would allow the auditor to trace e-cash transactions of a criminal suspect which was previously not efficient. (2) For the watchlist and related functions, we reduce the size of the escrow $Z$ from linear in the size of the auditor's input $x$, to logarithmic.
Additionally, we define and satisfy a stronger notion of security for $f$-PPBs, where a malicious auditor cannot frame a user in a transaction in which the user was not involved in.
Efficient Modular Multiplication Hardware for Number Theoretic Transform on FPGA
In this paper, we present a comprehensive analysis of various modular multiplication methods for Number Theoretic Transform (NTT) on FPGA. NTT is a critical and time-intensive component of Fully Homomorphic Encryption (FHE) applications while modular multiplication consumes a significant portion of the design resources in an NTT implementation. We study the existing modular reduction approaches from the literature, and implement particular methods on FPGA. Specifically Word-Level Montgomery (WLM)) for NTT friendly primes [1] and K2RED [2]. For improvements, we explore the trade-offs between the number of available primes in special forms and hardware cost of the reduction methods. We develop a DSP multiplication-optimized version of WLM, which we call WLM-Mixed. We also introduce a subclass of Proth primes, referred to as Proth-l primes, characterized by a low and fixed signed Hamming Weight. This special class of primes allows us to design multiplication-free shift-add versions of K2RED and naive Montgomery reduction [3], referred to as K2RED-Shift and Montgomery-Shift. We provide in-depth evaluations of these five reduction methods in an NTT architecture on FPGA. Our results indicate that WLM-Mixed is highly resource-efficient, utilizing only 3 DSP multiplications for 64-bit coefficient moduli. On the other hand, K2RED-Shift and Montgomery-Shift offer DSP-free alternatives, which can be beneficial in specific scenarios
IO-Optimized Design-Time Configurable Negacyclic Seven-Step NTT Architecture for FHE Applications
Uncategorized
Uncategorized
FHE enables computations on encrypted data, making it essential for privacy-preserving applications. However, it involves computationally demanding tasks, such as polynomial multiplication, while NTT is the state-of-the-art solution to perform this task. Most FHE schemes operate over the negacyclic ring of polynomials. We introduce a novel formulation of the hierarchical Four-Step NTT approach for the negacyclic ring, eliminating the need for pre- and post-processing steps found in the existing methods. To accelerate NTT operations, the FPGAs offer flexible and powerful computing platforms. We propose an FPGA-based, parametric and fully pipelined architecture that implements the improved Seven-Step NTT algorithm (which builds upon the four-step). Our design supports a wide range of parameters, including ring sizes up to $2^{16}$ and modulus sizes up to $64$-bit. We focus on achieving configurable throughput, as constrained by the bandwidth of HBM bandwidth, and aim to maximize throughput through an IO parametric design on the Alveo U280 FPGA. The implementation results demonstrate a reduction in the area-time-product by $2.08\times$ and a speed-up of $10.32\times$ for a ring size of $2^{16}$ and a 32-bit width compared to the current state-of-the-art designs.
Chosen-Prefix Collisions on AES-like Hashing
Chosen-prefix collision (CPC) attack was first presented by Stevens, Lenstra and de Weger on MD5 at Eurocrypt 2007. A CPC attack finds a collision for any two chosen prefixes, which is a stronger variant of collision attack. CPCs are naturally harder to construct but have larger practical impact than (identical-prefix) collisions, as seen from the series of previous works on MD5 by Stevens et al. and SHA-1 by Leurent and Peyrin. Despite its significance, the resistance of CPC attacks has not been studied on AES-like hashing.
In this work, we explore CPC attacks on AES-like hashing following the framework practiced on MD5 and SHA-1. Instead of the message modification technique developed for MD-SHA family, we opt for related-key rebound attack to construct collisions for AES-like hashing in view of its effectiveness. We also note that the CPC attack framework can be exploited to convert a specific class of one-block free-start collisions into two-block collisions, which sheds light on the importance of free-start collisions. As a result, we present the first CPC attacks on reduced Whirlpool, Saturnin-hash and AES-MMO/MP in classic and quantum settings, and extend the collision attack on Saturnin-hash from 5 to 6 rounds in the classic setting. As an independent contribution, we improve the memoryless algorithm of solving 3-round inbound phase by Hosoyamada and Sasaki at Eurocrpyt 2020, which leads to improved quantum attacks on Whirlpool. Notably, we find the first 6-round memoryless quantum collision attack on Whirlpool better than generic CNS collision finding algorithm when exponential-size qRAM is not available but exponential-size classic memory is available.
Authenticated private information retrieval
This paper introduces protocols for authenticated private information retrieval. These schemes enable a client to fetch a record from a remote database server such that (a) the server does not learn which record the client reads, and (b) the client either obtains the "authentic" record or detects server misbehavior and safely aborts. Both properties are crucial for many applications. Standard private-information-retrieval schemes either do not ensure this form of output authenticity, or they require multiple database replicas with an honest majority. In contrast, we offer multi-server schemes that protect security as long as at least one server is honest. Moreover, if the client can obtain a short digest of the database out of band, then our schemes require only a single server. Performing an authenticated private PGP-public-key lookup on an OpenPGP key server's database of 3.5 million keys (3 GiB), using two non-colluding servers, takes under 1.2 core-seconds of computation, essentially matching the time taken by unauthenticated private information retrieval. Our authenticated single-server schemes are 30-100$\times$ more costly than state-of-the-art unauthenticated single-server schemes, though they achieve incomparably stronger integrity properties.
Differential MITM attacks on SLIM and LBCIoT
SLIM and LBCIoT are lightweight block ciphers proposed for IoT applications. We present differential meet-in-the-middle attacks on these ciphers and discuss several implementation variants and possible improvements of these attacks. Experimental validation also shows some results that may be of independent interest in the cryptanalysis of other ciphers. Namely, the problems with low-probability differentials and the questionable accuracy of standard complexity estimates of using filters.
Partially Non-Interactive Two-Round Lattice-Based Threshold Signatures
This paper gives the first lattice-based two-round threshold signature based on lattice assumptions for which the first message is independent of the message being signed without relying on fully-homomorphic encryption, and our construction supports arbitrary thresholds.
Our construction provides a careful instantiation of a generic threshold signature construction by Tessaro and Zhu (EUROCRYPT ’23) based on specific linear hash functions, which in turns can be seen as a generalization of the FROST scheme by Komlo and Goldberg (SAC ’20). Our reduction techniques are new in the context of lattice-based cryptography. Also, our scheme does not use any heavy tools, such as NIZKs or homomorphic trapdoor commitments.
Verifying Jolt zkVM Lookup Semantics
Lookups are a popular way to express repeated constraints in state-of-the art SNARKs. This is especially the case for zero-knowledge virtual machines (zkVMs), which produce succinct proofs of correct execution for programs expressed as bytecode according to a specific instruction set architecture (ISA). The Jolt zkVM (Arun, Setty & Thaler, Eurocrypt 2024) for RISC-V ISA employs Lasso (Setty, Thaler & Wahby, Eurocrypt 2024), an efficient lookup argument for massive structured tables, to prove correct execution of instructions. Internally, Lasso performs multiple lookups into smaller subtables, then combines the results.
We present an approach to formally verify Lasso-style lookup arguments against the semantics of instruction set architectures. We demonstrate our approach by formalizing and verifying all Jolt 32-bit instructions corresponding to the RISC-V base instruction set (RV32I) using the ACL2 theorem proving system. Our formal ACL2 model has undergone extensive validation against the Rust implementation of Jolt. Due to ACL2's bit-blasting, rewriting, and developer-friendly features, our formalization is highly automated.
Through formalization, we also discovered optimizations to the Jolt codebase, leading to improved efficiency without impacting correctness or soundness. In particular, we removed one unnecessary lookup each for four instructions, and reduced the sizes of three subtables by 87.5\%.
Doubly Efficient Batched Private Information Retrieval
Private information retrieval (PIR) allows a client to read data from a server, without revealing which information they are interested in. A PIR is doubly efficient if the server runtime is, after a one-time pre-processing, sublinear in the database size. A recent breakthrough result from Lin, Mook, and Wichs [STOC’23] proposed the first-doubly efficient PIR with (online) server computation poly-logarithmic in the size of the database, assuming the hardness of the standard Ring-LWE problem.
In this work, we consider the problem of doubly efficient batched PIR (DEBPIR), where the client wishes to download multiple entries. This problem arises naturally in many practical applications of PIR, or when the database contains large entries. Our main result is a construction of DEBPIR where the amortized communication and server computation overhead is $\tilde{O}(1)$, from the Ring-LWE problem. This represents an exponential improvement compared with known constructions, and it is optimal up to poly-logarithmic factors in the security parameter. Interestingly, the server’s online operations are entirely combinatorial and all algebraic computations are done in the pre-processing or delegated to the client.
Do Not Disturb a Sleeping Falcon: Floating-Point Error Sensitivity of the Falcon Sampler and Its Consequences
Falcon is one of the three postquantum signature schemes already selected by NIST for standardization. It is the most compact among them, and offers excellent efficiency and security. However, it is based on a complex algorithm for lattice discrete Gaussian sampling which presents a number of implementation challenges. In particular, it relies on (possibly emulated) floating-point arithmetic, which is often regarded as a cause for concern, and has been leveraged in, e.g., side-channel analysis. The extent to which Falcon's use of floating point arithmetic can cause security issues has yet to be thoroughly explored in the literature.
In this paper, we contribute to filling this gap by identifying a way in which Falcon's lattice discrete Gaussian sampler, due to specific design choices, is singularly sensitive to floating-point errors. In the presence of small floating-point discrepancies (which can occur in various ways, including the use of the two almost but not quite equivalent signing procedures ``dynamic'' and ``tree'' exposed by the Falcon API), we find that, when called twice on the same input, the Falcon sampler has a small but significant chance (on the order of once in a few thousand calls) of outputting two different lattice points with a very structured difference, that immediately reveals the secret key. This is in contrast to other lattice Gaussian sampling algorithms like Peikert's sampler and Prest's hybrid sampler, that are stable with respect to small floating-point errors.
Correctly generated Falcon signatures include a salt that should in principle prevent the sampler to ever be called on the same input twice. In that sense, our observation has little impact on the security of Falcon signatures per se (beyond echoing warnings about the dangers of repeated randomness). On the other hand, it is critical for derandomized variants of Falcon, which have been proposed for use in numerous settings. One can mention in particular identity-based encryption, SNARK-friendly signatures, and sublinear signature aggregation. For all these settings, small floating point discrepancies have a chance of resulting in full private key exposure, even when using the slower, integer-based emulated floating-point arithmetic of Falcon's
reference implementation.
Impossibility Results for Post-Compromise Security in Real-World Communication Systems
Modern secure communication systems, such as iMessage, WhatsApp, and Signal include intricate mechanisms that aim to achieve very strong security properties. These mechanisms typically involve continuously merging in new fresh secrets into the keying material, which is used to encrypt messages during communications. In the literature, these mechanisms have been proven to achieve forms of Post Compromise Security (PCS): the ability to provide communication security even if the full state of a party was compromised some time in the past. However, recent work has shown these proofs do not transfer to the end-user level, possibly because of usability concerns. This has raised the question of whether end-users can actually obtain PCS or not, and under which conditions.
Here we show and formally prove that communication systems that need to be resilient against certain types of state loss (which can occur in practice) fundamentally cannot achieve full PCS for end-users. Whereas previous work showed that the Signal messenger did not achieve this with its current session-management layer, we isolate the exact conditions that cause this failure, and why this cannot be simply solved in communication systems by implementing a different session-management layer or an entirely different protocol. Moreover, we clarify the trade-off of the maximum number of sessions between two users (40 in Signal) in terms of failure-resilience versus security.
Our results have direct consequences for the design of future secure communication systems, and could motivate either the simplification of redundant mechanisms, or the improvement of session-management designs to provide better security trade-offs with respect to state loss/failure tolerance.
Improved PIR Schemes using Matching Vectors and Derivatives
In this paper, we construct new t-server Private Information Retrieval (PIR) schemes with communication complexity subpolynomial in the previously best known, for all but finitely many t. Our results are
based on combining derivatives (in the spirit of Woodruff-Yekhanin) with the Matching Vector based PIRs of Yekhanin and Efremenko. Previously such a combination was achieved in an ingenious way by Dvir and Gopi, using polynomials and derivatives over certain exotic rings, en route to their fundamental result giving the first 2-server PIR with subpolynomial communication.
Our improved PIRs are based on two ingredients:
• We develop a new and direct approach to combine derivatives with Matching Vector based PIRs.
This approach is much simpler than that of Dvir-Gopi: it works over the same field as the original PIRs, and only uses elementary properties of polynomials and derivatives.
• A key subproblem that arises in the above approach is a higher-order polynomial interpolation problem. We show how “sparse S-decoding polynomials”, a powerful tool from the original constructions of Matching Vector PIRs, can be used to solve this higher-order polynomial interpolation problem using surprisingly few higer-order evaluations.
Using the known sparse S-decoding polynomials in combination with our
ideas leads to our improved PIRs. Notably, we get a 3-server PIR scheme with communication $2^{O^\sim( (\log n)^{1/3}) }$, improving upon the previously best known communication of $2^{O^\sim( \sqrt{\log n})}$ due to Efremenko.
Ceno: Non-uniform, Segment and Parallel Zero-knowledge Virtual Machine
In this paper, we explore a novel Zero-knowledge Virtual Machine (zkVM) framework leveraging succinct, non-interactive zero-knowledge proofs for verifiable computation over any code. Our approach divides the proof of program execution into two stages. In the first stage, the process breaks down program execution into segments, identifying and grouping identical sections. These segments are then proved through data-parallel circuits that allow for varying amounts of duplication. In the subsequent stage, the verifier examines these segment proofs, reconstructing the program's control and data flow based on the segments' duplication number and the original program. The second stage can be further attested by a uniform recursive proof.
We propose two specific designs of this concept, where segmentation and parallelization occur at two levels: opcode and basic block. Both designs try to minimize the control flow that affects the circuit size and support dynamic copy numbers, ensuring that computational costs directly correlate with the actual code executed (i.e., you only pay as much as you use). In our second design, in particular, by proposing an innovative data-flow reconstruction technique in the second stage, we can drastically cut down on the stack operations even compared to the original program execution. Note that the two designs are complementary rather than mutually exclusive. Integrating both approaches in the same zkVM could unlock more significant potential to accommodate various program patterns.
We present an asymmetric GKR scheme to implement our designs, pairing a non-uniform prover and a uniform verifier to generate proofs for dynamic-length data-parallel circuits. The use of a GKR prover also significantly reduces the size of the commitment. GKR allows us to commit only the circuit's input and output, whereas in Plonkish-based solutions, the prover needs to commit to all the witnesses.
Age-aware Fairness in Blockchain Transaction Ordering for Reducing Tail Latency
In blockchain networks, transaction latency is crucial for determining the quality of service (QoS). The latency of a transaction is measured as the time between its issuance and its inclusion in a block in the chain. A block proposer often prioritizes transactions with higher fees or transactions from accounts it is associated with, to minimize their latencies. To maintain fairness among transactions, a block proposer is expected to select the included transactions randomly. The random selection might cause some transactions to experience high latency following the variance in the time a transaction waits until it is selected. We suggest an alternative, age-aware approach towards fairness so that transaction priority is increased upon observing a large waiting time. We explain that a challenge with this approach is that the age of a transaction is not absolute due to transaction propagation. Moreover, a node might present its transactions as older to obtain priority. We describe a new technique to enforce a fair block selection while prioritizing transactions that observed high latency. The technique is based on various declaration schemes in which a node declares its pending transactions, providing the ability to validate transaction age. By evaluating the solutions on Ethereum data and synthetic data of various scenarios, we demonstrate the advantages of the approach under realistic conditions and understand its potential impact to maintain fairness and reduce tail latency.
Push-Button Verification for BitVM Implementations
Bitcoin, while being the most prominent blockchain with the largest market capitalization, suffers from scalability and throughput limitations that impede the development of ecosystem projects like Bitcoin Decentralized Finance (BTCFi). Recent advancements in BitVM propose a promising Layer 2 (L2) solution to enhance Bitcoin's scalability by enabling complex computations off-chain with on-chain verification. However, Bitcoin's constrained programming environment—characterized by its non-Turing-complete Script language lacking loops and recursion, and strict block size limits—makes developing complex applications labor-intensive, error-prone, and necessitates manual partitioning of scripts. Under this complex programming model, subtle mistakes could lead to irreversible damage in a trustless environment like Bitcoin. Ensuring the correctness and security of such programs becomes paramount.
To address these challenges, we introduce the first formal verifier for BitVM implementations. Our approach involves designing a register-based, higher-level domain-specific language (DSL) that abstracts away complex stack operations, allowing developers to reason about program correctness more effectively while preserving the semantics of the low-level program. We present a formal computational model capturing the semantics of BitVM execution and Bitcoin script, providing a foundation for rigorous verification. To efficiently handle large programs and complex constraints arising from unrolled computations that simulate loops, we summarize repetitive "loop-style" computations using loop invariant predicates in our DSL. We leverage a counterexample-guided inductive synthesis (CEGIS) procedure to lift low-level Bitcoin script into our DSL, facilitating efficient verification without sacrificing accuracy. Evaluated on 78 benchmarks from BitVM implementations, our tool successfully verifies 83% of cases within 12.55 seconds on average and identified one previously unknown vulnerability, demonstrating its effectiveness in enhancing the security and reliability of BitVM.
Fine-Grained Non-Interactive Key-Exchange without Idealized Assumptions, and Lower Bounds
In this paper, we study multi-party non-interactive key exchange (NIKE) in the fine-grained setting. More precisely, we propose three multi-party NIKE schemes in three computation models, namely, the bounded parallel-time, bounded time, and bounded storage models. Their security is based on a very mild assumption (e.g., NC1 ⊊ ⊕L/poly) or even without any complexity assumption. This improves the recent work of Afshar, Couteau, Mahmoody, and Sadeghi (EUROCRYPT 2023) that requires idealized assumptions, such as random oracles or generic groups.
Additionally, we show that all our constructions satisfy a natural desirable property that we refer to as extendability, and we give generic transformations from extendable multi-party NIKE to multi-party identity-based NIKEs in the fine-grained settings.
Furthermore, we generalize the lower bound on users’ storage consumption in the bounded storage model by Dziembowski and Maurer (Eurocrypt 2004) to encompass any multi-party NIKE with extendability. This new lower bound suggests that the users’ storage consumption of our multi-party NIKE in the bounded storage model is optimal.
A Fault Analysis on SNOVA
SNOVA is a post-quantum cryptographic signature scheme known for its efficiency and compact key sizes, making it a second-round candidate in the NIST post-quantum cryptography standardization process. This paper presents a comprehensive fault analysis of SNOVA, focusing on both permanent and transient faults during signature generation. We introduce several fault injection strategies that exploit SNOVA's structure to recover partial or complete secret keys with limited faulty signatures. Our analysis reveals that as few as $22$ to $68$ faulty signatures, depending on the security level, can suffice for key recovery. We propose a novel fault-assisted reconciliation attack, demonstrating its effectiveness in extracting the secret key space via solving a quadratic polynomial system. Simulations show transient faults in key signature generation steps can significantly compromise SNOVA’s security. To address these vulnerabilities, we propose a lightweight countermeasure to reduce the success of fault attacks without adding significant overhead. Our results highlight the importance of fault-resistant mechanisms in post-quantum cryptographic schemes like SNOVA to ensure robustness.
Single Trace Side-Channel Attack on the MPC-in-the-Head Framework
In this paper, we present the first single trace side-channel attack that targets the MPC-in-the-Head (MPCitH) framework based on threshold secret sharing, also known as Threshold Computation in the Head (TCitH) in its original version. This MPCitH framework can be found in 5 of the 14 digital signatures schemes in the recent second round of the National Institute of Standards and Technology (NIST) call for digital signatures. In this work, we start by highlighting a side-channel vulnerability of the TCitH framework and show an exploitation of it on the SDitH algorithm, which is part of this NIST call. Specifically, we exploit the leakage of a multiplication function in the Galois field to make predictions about intermediate values, and we use the structure of the algorithm to combine information efficiently. This allows us to build an attack that is both the first Soft Analytical Side-Channel Attack (SASCA) targeting the MPCitH framework, as well as the first attack on SDitH. More specifically, we build a SASCA based on Belief Propagation (BP) on the evaluation of polynomials in the signature using the threshold variant structure to reconstruct the secret key. We perform simulated attacks under the Hamming Weight (HW) leakage model, enabling us to evaluate the resistance of the scheme against SASCA. We then perform our attacks in a real case scenario, more specifically on the STM32F407, and recover the secret key for all the security levels. We end this paper by discussing the various shuffling countermeasures we could use to mitigate our attacks.
Efficient Verifiable Differential Privacy with Input Authenticity in the Local and Shuffle Model
Local differential privacy (LDP) enables the efficient release of aggregate statistics without having to trust the central server (aggregator), as in the central model of differential privacy, and simultaneously protects a client's sensitive data. The shuffle model with LDP provides an additional layer of privacy, by disconnecting the link between clients and the aggregator. However, LDP has been shown to be vulnerable to malicious clients who can perform both input and output manipulation attacks, i.e., before and after applying the LDP mechanism, to skew the aggregator's results. In this work, we show how to prevent malicious clients from compromising LDP schemes. Our only realistic assumption is that the initial raw input is authenticated; the rest of the processing pipeline, e.g., formatting the input and applying the LDP mechanism, may be under adversarial control. We give several real-world examples where this assumption is justified. Our proposed schemes for verifiable LDP (VLDP), prevent both input and output manipulation attacks against generic LDP mechanisms, requiring only one-time interaction between client and server, unlike existing alternatives [37, 43]. Most importantly, we are the first to provide an efficient scheme for VLDP in the shuffle model. We describe, and prove security of, two schemes for VLDP in the local model, and one in the shuffle model. We show that all schemes are highly practical, with client run times of less than 2 seconds, and server run times of 5-7 milliseconds per client.
Compactly Committing Authenticated Encryption Using Encryptment and Tweakable Block Cipher
Facebook introduced message franking to enable users to report abusive content verifiably in end-to-end encrypted messaging. Grubbs et al. formalized the underlying primitive called compactly committing authenticated encryption with associated data (ccAEAD) and presented schemes with provable security. Dodis et al. proposed a core building block called encryptment and presented a generic construction of ccAEAD with encryptment and standard AEAD. This paper first proposes to use a tweakable block cipher instead of AEAD for the generic construction of Dodis et al. In the security analysis of the proposed construction, its ciphertext integrity is shown to require a new but feasible assumption on the ciphertext integrity of encryptment. Then, this paper formalizes remotely keyed ccAEAD (RK ccAEAD) and shows that the proposed construction works as RK ccAEAD. Finally, the confidentiality of the proposed construction as RK ccAEAD is shown to require a new variant of confidentiality for encryptment. The problem of remotely keyed encryption was posed by Blaze in 1996. It is now related to the problem of designing a cryptographic scheme using a trusted module and/or with leakage resiliency.
A Formal Treatment of Envelope Encryption
Envelope encryption is a method to encrypt data with two distinct keys in its basic form. Data is first encrypted with a data-encryption key, and then the data-encryption key is encrypted with a key-encryption key. Despite its deployment in major cloud services, as far as we know, envelope encryption has not received any formal treatment. To address this issue, we first formalize the syntax and security requirements of envelope encryption in the symmetric-key setting. Then, we show that it can be constructed by combining encryptment and authenticated encryption with associated data (AEAD). Encryptment is one-time AEAD satisfying that a small part of a ciphertext works as a commitment to the corresponding secret key, message, and associated data. Finally, we show that the security of the generic construction is reduced to the security of the underlying encryptment and AEAD.
THOR: Secure Transformer Inference with Homomorphic Encryption
As language models are increasingly deployed in cloud environments, privacy concerns have become a significant issue. To address this, we design THOR, a secure inference framework for transformer models on encrypted data. Specifically, we first propose new fast matrix multiplication algorithms based on diagonal-major order encoding and extend them to parallel matrix computation through the compact ciphertext packing technique. Second, we design efficient protocols for secure computations of four non-linear functions such as softmax, LayerNorm, GELU, and Tanh, by integrating advanced underlying approximation methods with tailored optimizations. Our matrix multiplication algorithms reduce the number of key-switching operations in the linear layers of the attention block in the BERT-base model by up to 14.5x, compared to the state-of-the-art HE-based secure inference protocol (Park et al., Preprint). Combined with cryptographic optimizations, our experimental results demonstrate that THOR provides secure inference for the BERT-base model with a latency of 10.43 minutes on a single GPU, while maintaining comparable inference accuracy on the MRPC dataset.
Cryptography Experiments In Lean 4: SHA-3 Implementation
In this paper we explain how we implemented the Secure Hash Algorithm-3 (SHA-3) family of functions in Lean 4, a functional programming language and theorem prover. We describe how we used several Lean facilities including type classes, dependent types, macros, and formal verification, and then refined the design to provide a simple one-shot and streaming API for hashing, and Extendable-output functions (XOFs), to reduce potential for misuse by users, and formally prove properties about the implementation.
A Closer Look at the Belief Propagation Algorithm in Side-Channel-Assisted Chosen-Ciphertext Attacks
The implementation security of post-quantum cryptography (PQC) algorithms has emerged as a critical concern with the PQC standardization process reaching its end. In a side-channel-assisted chosen-ciphertext attack, the attacker builds linear inequalities on secret key components and uses the belief propagation (BP) algorithm to solve. The number of inequalities leverages the query complexity of the attack, so the fewer the better. In this paper, we use the PQC standard algorithm CRYSTALS-Kyber as a study case to construct bilateral inequalities on key variables with substantially narrower intervals using a side-channel-assisted oracle. For Kyber512, Kyber768, and Kyber 1024, the average Shannon entropy carried by such inequality is improved from the previous 0.6094, 0.4734, and 0.8544 to 0.6418, 0.4777, and 1.2007. The number of such inequalities required to recover the key utilizing the BP algorithm for Kyber512 and Kyber1024 is reduced by 5.32% and 40.53% in theory and experimentally the reduction is even better. The query complexity is reduced by 43%, 37%, and 48% for Kyber512, 768, and 1024 assuming reasonably perfect reliability.
Furthermore, we introduce a strategy aimed at further refining the interval of inequalities. Diving into the BP algorithm, we discover a measure metric named JSD (Jensen-Shannon distance)-metric that can gauge the tightness of an inequality. We then develop a machine learning-based strategy to utilize the JSD-metrics to contract boundaries of inequalities even with fewer inequalities given, thus improving the entropy carried by the system of linear inequalities. This contraction strategy is at the algorithmic level and has the potential to be employed in all attacks endeavoring to establish a system of inequalities concerning key variables.
Free-XOR Gate Bootstrapping
The FHEW-like gate bootstrapping framework operates in a 2-bit plaintext space, where logic gates such as NAND, XOR, and AND are implemented by adding two ciphertexts and extracting the most significant bit. However, each gate operation requires bootstrapping with a primary cost of one blind rotation, which is expensive, when processing circuit operations for applications. We propose a novel Free-XOR gate bootstrapping framework based on a single-bit plaintext space, in which the XOR operation is realized by simply adding two ciphertexts, resulting in an almost free computational cost. To form a minimal complete set for logical operations, we design an algorithm for the AND gate within this framework. The AND gate cost of our Free-XOR gate bootstrapping involves two blind rotations. However, by utilizing a single-bit plaintext space to enhance noise tolerance and swapping some operations of the bootstrapping process, we can adopt a more compact parameter setting, which in turn accelerates the speed of blind rotation. We propose an instantiation of the NTRU-based AND gate operation, which requires two blind rotations. Despite the additional rotation, the overall computational cost is marginally lower than the state-of-the-art gate bootstrapping scheme LLW+ [TCHES24], which utilizes only a single blind rotation. In addition, our approach achieves a significant reduction in key size, reducing it to 3.3 times the size of LLW+ [TCHES24].
How to Redact the Bitcoin Backbone Protocol
We explain how to extend the Bitcoin backbone model of Garay et al. (Eurocrypt, 2015) to accommodate for redactable blockchains. Our extension captures fluid blockchain-based databases (with mutability requirements) and compliance with existing legislation, such as the GDPR right to be forgotten, or the need to erase offending data from nodes’ databases that would otherwise provoke legal shutdowns. Our redactable backbone protocol retains the essential properties of blockchains. Leveraging zero-knowledge proofs, old data can be erased without requiring trusted third parties or heuristics about past chain validation. Our solution can be implemented on Bitcoin immediately without hard-forks, and it is scalable. It allows the redaction of data from UTXOs or unconfirmed transactions that have not yet flooded the network, while guaranteeing invariance of the Bitcoin state. Thus, offending data does not need to persist in the system, not even temporarily.
A Practical Protocol for Quantum Oblivious Transfer from One-Way Functions
We present a new simulation-secure quantum oblivious transfer (QOT) protocol based on one-way functions in the plain model. With a focus on practical implementation, our protocol surpasses prior works in efficiency, promising feasible experimental realization. We address potential experimental errors and their correction, offering analytical expressions to facilitate the analysis of the required quantum resources. Technically, we achieve simulation security for QOT through an equivocal and relaxed-extractable quantum bit commitment.
Symmetric Twin Column Parity Mixers and their Applications
The circulant twin column parity mixer (TCPM) is a type of mixing layer for the round function of cryptographic permutations designed by Hirch et al. at CRYPTO 2023. It has a bitwise differential branch number of 12 and a bitwise linear branch number of 4, which makes it competitive in applications where differential security is required. Hirch et al. gave a concrete instantiation of a permutation using such a mixing layer, named Gaston, and showed the best 3-round differential and linear trails of Gaston have much higher weights than those of ASCON. In this paper, we first prove why the TCPM has linear branch number 4 and then show that Gaston's linear behavior is worse than ASCON for more than 3 rounds. Motivated by these facts, we aim to enhance the linear security of the TCPM. We show that adding a specific set of row cyclic shifts to the TCPM can make its differential and linear branch numbers both 12. Notably, by setting a special relationship between the row shift parameters of the modified TCPM, we obtain a special kind of mixlayer called the symmetric circulant twin column parity mixer. The symmetric TCPM has a unique design property that its differential and linear branch histograms are the same, which makes the parameter selection process and the security analysis convenient. Using the symmetric TCPM, we present two new 320-bit cryptographic permutations, namely (1) Gaston-S where we replace the mixing layer in Gaston with the symmetric TCPM and (2) SBD which uses a low-latency degree-4 S-box as the non-linear layer and the symmetric TCPM as the mixing layer. We evaluate the security of these permutations considering differential, linear and algebraic analysis, and then provide the performance comparison with Gaston in both hardware and software. Our results indicate that Gaston-S and SBD are competitive with Gaston in both security and performance.
Practical Zero-Knowledge PIOP for Public Key and Ciphertext Generation in (Multi-Group) Homomorphic Encryption
Homomorphic encryption (HE) is a foundational technology in privacy-enhancing cryptography, enabling non-interactive computation over encrypted data. Recently, generalized HE primitives designed for multi-party applications, such as multi-group HE (MGHE), have gained significant research interest.
While constructing secure multi-party protocols from (MG)HE in the semi-honest model is straightforward, zero-knowledge techniques are essential for ensuring security against malicious adversaries.
In this work, we design practical proof systems for MGHE to guarantee the well-formedness of public keys and ciphertexts. Specifically, we develop and optimize a polynomial interactive oracle proof (PIOP) for MGHE, which can be compiled into zk-SNARKs using a polynomial commitment scheme (PCS).
We compile our PIOP using a lattice-based PCS, and our implementation achieves a 5.5x reduction in proof size, a 70x speed-up in proof generation, and a 343x improvement in verification time compared to the previous state-of-the-art construction, PELTA (ACM CCS 2023). Additionally, our PIOPs are modular, enabling the use of alternative PCSs to optimize other aspects, such as further reducing proof sizes.
On the vector subspaces of $\mathbb{F}_{2^n}$ over which the multiplicative inverse function sums to zero
We study the behavior of the multiplicative inverse function (which plays an important role in cryptography and in the study of finite fields), with respect to a recently introduced generalization of almost perfect nonlinearity (APNness), called $k$th-order sum-freedom, that extends a classic characterization of APN functions, and has also some relationship with integral attacks. This generalization corresponds to the fact that a vectorial function $F:\mathbb F_2^n\mapsto \mathbb F_2^m$ sums to a nonzero value over every $k$-dimensional affine subspace of $\mathbb F_2^n$, for some $k\leq n$ (APNness corresponds to $k=2$). The sum of the values of the inverse function $x\in \mathbb F_{2^n}\mapsto x^{2^n-2}\in \mathbb F_{2^n}$ over any affine subspace $A$ of $\mathbb{F}_{2^n}$ not containing 0 ({\em i.e.} being not a vector space) has been addressed, thanks to a simple expression of such sum, which shows that it never vanishes. We study in the present paper the case of vector (i.e. linear) subspaces, which is much less simple to handle. The sum depends on a coefficient in subspace polynomials.
We study for which values of $k$ the multiplicative inverse function can sum to nonzero values over all $k$-dimensional vector subspaces. We show that, for every $k$ not co-prime with $n$, it sums to zero over at least one $k$-dimensional $\mathbb{F}_2$-subspace of $\mathbb{F}_{2^n}$. We study the behavior of the inverse function over direct sums of vector spaces and we deduce that the property of the inverse function to be $k$th-order sum-free happens for $k$ if and only if it happens for $n-k$. We derive several other results and we show that the set of values $k$ such that the inverse function is not $k$th-order sum-free is stable when adding two values of $k$ whose product is smaller than $n$ (and when subtracting two values under some conditions). We clarify the case of dimension at most 4 (equivalently, of co-dimension at most 4) and this allows to address, for every $n$, all small enough values of $k$ of the form $3a+4b$.
Tighter Security for Group Key Agreement in the Random Oracle Model
The Messaging Layer Security (MLS) protocol, recently standardized in RFC 9420, aims to provide efficient asynchronous group key establishment with strong security guarantees. The main component of MLS, which is the source of its important efficiency and security properties, is a protocol called TreeKEM. Given that a major vision for the MLS protocol is for it to become the new standard for messaging applications like WhatsApp, Facebook Messenger, Signal, etc., it has the potential to be used by a huge number of users. Thus, it is important to better understand the security of MLS and hence also of TreeKEM. In a previous work by Klein et. al, TreeKEM was proven adaptively secure in the Random Oracle Model (ROM) with a polynomial loss in security by proving a result about the security of an arbitrary IND-CPA secure public-key encryption scheme in a public-key version of the Generalized Selective Decryption (GSD) security game.
In this work, we prove a tighter bound for the security of TreeKEM. We follow the approach in the aforementioned work and first introduce a modified version of the public-key GSD game better suited for analyzing TreeKEM. We then provide a simple and detailed proof of security for a specific encryption scheme, the DHIES scheme (currently the only standardized scheme in MLS), in this game in the ROM and achieve a tighter bound compared to the result from Klein et. al. We also define and describe the syntax and security of TreeKEM-like schemes and state a result linking the security of TreeKEM with security in our GSD game in the ROM.
On the Black-Box Complexity of Private-Key Inner-Product Functional Encryption
We initiate the study of the black-box complexity of private-key functional encryption (FE). Of central importance in the private-key setting is the inner-product functionality, which is currently only known from assumptions that imply public-key encryption, such as Decisional Diffie-Hellman or Learning-with-Errors. As our main result, we rule out black-box constructions of private-key inner-product FE from random oracles. This implies a black-box separation between private-key inner-product FE from all symmetric-key primitives implied by random oracles (e.g., symmetric-key encryption and collision-resistant hash functions).
Proving lower bounds for private-key functional encryption schemes introduces challenges that were absent in prior works. In particular, the combinatorial techniques developed by prior works for proving black-box lower bounds are only useful in the public-key setting and predicate encryption settings, which all fail for the private-key FE case. Our work develops novel combinatorial techniques based on Fourier analysis to overcome these barriers. We expect these techniques to be widely useful in future research in this area.
On the Relationship between Public Key Primitives via Indifferentiability
Recently, Masny and Rindal [MR19] formalized a notion called Endemic Oblivious Transfer (EOT), and they proposed a generic transformation from Non-Interactive Key Exchange (NIKE) to EOT with standalone security in the random oracle (RO) model. However, from the model level, the relationship between idealized NIKE and idealized EOT and the relationship between idealized elementary public key primitives have been rarely researched.
In this work, we investigate the relationship between ideal NIKE and ideal one-round EOT, as well as the relationship between ideal public key encryption (PKE) and ideal two-round Oblivious Transfer (OT), in the indifferentiability framework proposed by Maurer et al.(MRH04). Our results are threefold: Firstly, we model ideal PKE without public key validity test, ideal one-round EOT and ideal two-round OT in the indifferentiability framework. Secondly, we show that ideal NIKE and ideal one-round EOT are equivalent, and ideal PKE without public key validity test are equivalent to ideal two-round OT. Thirdly, we show a separation between ideal two-round OT and ideal one-round EOT, which implies a separation between ideal PKE and ideal NIKE.
Randomness Bounds for Private Simultaneous Messages and Conditional Disclosure of Secrets
In cryptography, the private simultaneous messages (PSM) and conditional disclosure of secrets (CDS) are closely related fundamental primitives. We consider $k$-party PSM and CDS protocols for a function $f$ with a common random string, where each party $P_i$ generates a message and sends it to a referee $P_0$. We consider bounds for the optimal length $\rho$ of the common random string among $k$ parties (or, {\it randomness complexity}) in PSM and CDS protocols with perfect and statistical privacy through combinatorial and entropic arguments. ($i$) We provide general connections from the optimal total length $\lambda$ of the messages (or, {\it communication complexity}) to the randomness complexity $\rho$. ($ii$) We also prove randomness lower bounds in PSM and CDS protocols for general functions. ($iii$) We further prove randomness lower bounds for several important explicit functions. They contain the following results: For PSM protocols with perfect privacy, we prove $\lambda-1 \le \rho$ as the general connection. From the general lower bound, we prove $\rho\ge 2^{(k-1)n}-1$ for a general function $f:(\{0,1\}^n)^k\rightarrow\{0,1\}$ under universal reconstruction, in which $P_0$ is independent of $f$. This implies that the Feige-Killian-Naor PSM protocol for a general function [Proc.~STOC '94, pp.554--563] is optimal with respect to randomness complexity. We also provide a randomness lower bound $\rho> kn-2$ for a generalized inner product function. This implies the optimality of the $2$-party PSM protocol for the inner-product function of Liu, Vaikuntanathan, and Wee [Proc.~CRYPTO 2017, pp.758--790]. For CDS protocols with perfect privacy, we show $\rho\ge\lambda-\sigma$ as the general connection by similar argument to those for PSM protocols, where $\sigma$ is the length of secrets. We also obtain randomness lower bounds $\rho\ge (k-1)\sigma$ for XOR, AND, and generalized inner product functions. These imply the optimality of Applebaum and Arkis's $k$-party CDS protocol for a general function [Proc. TCC 2018, pp.317--344] up to a constant factor in a large $k$.
Impossible Boomerang Attacks Revisited: Applications to Deoxys-BC, Joltik-BC and SKINNY
The impossible boomerang (IB) attack was first introduced by Lu in his doctoral thesis and subsequently published at DCC in 2011. The IB attack is a variant of the impossible differential (ID) attack by incorporating the idea of the boomerang attack. In this paper, we revisit the IB attack, and introduce the incompatibility of two characteristics in boomerang to the construction of an IB distinguisher. With our methodology, all the constructions of IB distinguisher are represented in a unified manner. Moreover, we show that the related-(twea)key IB distinguishers possess more freedom than the ones of ID so that it can cover more rounds.
We also propose a new tool based on Mixed-Integer Quadratically-Constrained Programming (MIQCP) to search for IB attacks. To illustrate the power of the IB attack, we mount attacks against three tweakable block ciphers: Deoxys-BC, Joltik-BC and SKINNY. For Deoxys-BC, we propose a related-tweakey IB attack on 14-round Deoxys-BC-384, which improves the best previous related-tweakey ID attack by 2 rounds, and we improve the data complexity of the best previous related-tweakey ID attack on 10-round Deoxys-BC-256. For Joltik-BC, we propose the best attacks against 10-round Joltik-BC-128 and 14-round Joltik-BC-192 with related-tweakey IB attack. For SKINNY-n-3n, we propose a 27-round related-tweakey IB attack, which improves both the time and the memory complexities of the best previous ID attack. We also propose the first related-tweakey IB attack on 28 round SKINNY-n-3n, which improves the previous best ID attack by one round.
Unbounded Leakage-Resilient Encryption and Signatures
Given the devastating security compromises caused by side-channel attacks on existing classical systems, can we store our private data encoded as a quantum state so that they can be kept private in the face of arbitrary side-channel attacks?
The unclonable nature of quantum information allows us to build various quantum protection schemes for cryptographic information such as secret keys. Examples of quantum protection notions include copy-protection, secure leasing, and finally, unbounded leakage-resilience, which was recently introduced by Çakan, Goyal, Liu-Zhang and Ribeiro (TCC'24). Çakan et al show that secrets of various cryptographic schemes (such as cryptographic keys or secret shares) can be protected by storing them as quantum states so that they satisfy LOCC (local operation and classical communication) leakage-resilience: the scheme can tolerate any unbounded amount of adaptive leakage over unbounded rounds. As a special case (dubbed $1$-round leakage), this also means that those quantum states cannot be converted to classical strings (without completely losing their functionality).
In this work, we continue the study of unbounded/LOCC leakage-resilience and consider several new primitive. In more details, we build ciphertexts, signatures and non-interactive zero-knowledge proofs with unbounded leakage-resilience. We show the following results.
- Assuming the existence of a classical $X \in \{\text{secret-key encryption}, \text{public-key encryption}\}$ scheme, we construct an $X$ scheme with LOCC leakage-resilient ciphertexts. This guarantees that an adversary who obtains LOCC-leakage on ciphertexts cannot learn anything about their contents, even if they obtain the secret key later on.
- Assuming the existence of a classical signature scheme and indistinguishability obfuscation (iO), we construct a signature scheme with LOCC leakage-resilient signatures. This guarantees that an adversary who obtains LOCC-leakage on various signatures cannot produce any valid signatures at all other than the ones it obtained honestly!
- Assuming the existence of one-way functions and indistinguishability obfuscation (iO), we construct a NIZK proof system with LOCC leakage-resilient proofs. This guarantees that an adversary who obtains LOCC-leakage on a NIZK proof of an hard instance cannot produce a valid proof!
mUOV: Masking the Unbalanced Oil and Vinegar Digital Sigital Signature Scheme at First- and Higher-Order
The National Institute for Standards and Technology (NIST) initiated a standardization procedure for additional digital signatures and recently announced round-2 candidates for the PQ additional digital signature schemes. The multivariate digital signature scheme Unbalanced Oil and Vinegar (UOV) is one of the oldest post-quantum schemes and has been selected by NIST for Round 2. Although UOV is mathematically secure, several side-channel attacks (SCA) have been shown on the UOV or UOV-based digital signatures. We carefully analyze the sensitivity of variables and operations in the UOV scheme from the side-channel perspective and show which require protection.
To mitigate implementation-based SCA, we integrate a provably secure arbitrary-order masking technique with the key generation and signature generation algorithms of UOV. We propose efficient techniques for the masked dot-product and matrix-vector operations, which are both critical in multivariate DS schemes. We also implemented and demonstrate the practical feasibility of our masking algorithms for UOV-Ip on the ARM Cortex-M4 microcontroller. Our first-order masked UOV implementations have $2.7\times$ and $3.6\times$ performance overhead compared to the unmasked scheme for key generation and signature generation algorithms. Our first-order masked UOV implementations use $1.3\times$ and $1.9\times$ stack memory rather than the unmasked version of the key generation and signature generation algorithms.
Somewhat Homomorphic Encryption from Linear Homomorphism and Sparse LPN
We construct somewhat homomorphic encryption schemes from the learning sparse parities with noise (sparse LPN) problem, along with an assumption that implies linearly homomorphic encryption (e.g., the decisional Diffie-Hellman or decisional composite residuosity assumptions). Our resulting schemes support an a-priori bounded number of homomorphic operations: $O(\log \lambda/\log \log \lambda)$ multiplications followed by poly($\lambda$) additions, where $\lambda \in \mathbb{N}$ is a security parameter. These schemes have compact ciphertexts: after homomorphic evaluation, the bit-length of each ciphertext is a fixed polynomial in the security parameter $\lambda$, independent of the number of homomorphic operations applied to it. This gives the first somewhat homomorphic encryption schemes that can evaluate the class of bounded-degree polynomials with a bounded number of monomials without relying on lattice assumptions or bilinear maps.
Much like in the Gentry-Sahai-Waters fully homomorphic encryption scheme, ciphertexts in our scheme are matrices, homomorphic addition is matrix addition, and homomorphic multiplication is matrix multiplication. Moreover, when encrypting many messages at once and performing many homomorphic evaluations at once, the bit-length of ciphertexts in some of our schemes (before and after homomorphic evaluation) can be arbitrarily close to the bit-length of the plaintexts. The main limitation of our schemes is that they require a large evaluation key, whose size scales with the complexity of the homomorphic computation performed, though this key can be re-used across any polynomial number of encryptions and evaluations.
Distributed Differential Privacy via Shuffling vs Aggregation: a Curious Study
How to achieve distributed differential privacy (DP) without a trusted central party is of great interest in both theory and practice. Recently, the shuffle model has attracted much attention. Unlike the local DP model in which the users send randomized data directly to the data collector/analyzer, in the shuffle model an intermediate untrusted shuffler is introduced to randomly permute the data, which have already been randomized by the users, before they reach the analyzer. The most appealing aspect is that while shuffling does not explicitly add more noise to the data, it can make privacy better. The privacy amplification effect in consequence means the users need to add less noise to the data than in the local DP model, but can achieve the same level of differential privacy. Thus, protocols in the shuffle model can provide better accuracy than those in the local DP model. What looks interesting to us is that the architecture of the shuffle model is similar to private aggregation, which has been studied for more than a decade. In private aggregation, locally randomized user data are aggregated by an intermediate untrusted aggregator. Thus, our question is whether aggregation also exhibits some sort of privacy amplification effect? And if so, how good is this ``aggregation model'' in comparison with the shuffle model. We conducted the first comparative study between the two, covering privacy amplification, functionalities, protocol accuracy, and practicality. The results as yet suggest that the new shuffle model does not have obvious advantages over the old aggregation model. On the contrary, protocols in the aggregation model outperform those in the shuffle model, sometimes significantly, in many aspects.
The Learning Stabilizers with Noise problem
Random classical codes have good error correcting properties, and yet they are notoriously hard to decode in practice. Despite many decades of extensive study, the fastest known algorithms still run in exponential time. The Learning Parity with Noise (LPN) problem, which can be seen as the task of decoding a random linear code in the presence of noise, has thus emerged as a prominent hardness assumption with numerous applications in both cryptography and learning theory.
Is there a natural quantum analog of the LPN problem? In this work, we introduce the Learning Stabilizers with Noise (LSN) problem, the task of decoding a random stabilizer code in the presence of local depolarizing noise. We give both polynomial-time and exponential-time quantum algorithms for solving LSN in various depolarizing noise regimes, ranging from extremely low noise, to low constant noise rates, and even higher noise rates up to a threshold. Next, we provide concrete evidence that LSN is hard. First, we show that LSN includes LPN as a special case, which suggests that it is at least as hard as its classical counterpart. Second, we prove a worst-case to average-case reduction for variants of LSN. We then ask: what is the computational complexity of solving LSN? Because the task features quantum inputs, its complexity cannot be characterized by traditional complexity classes. Instead, we show that the LSN problem lies in a recently introduced (distributional and oracle) unitary synthesis class. Finally, we identify several applications of our LSN assumption, ranging from the construction of quantum bit commitment schemes to the computational limitations of learning from quantum data.
Masking Gaussian Elimination at Arbitrary Order, with Application to Multivariate- and Code-Based PQC
Digital signature schemes based on multivariate- and code-based hard problems are promising alternatives for lattice-based signature schemes, due to their smaller signature size. Hence, several candidates in the ongoing additional standardization for quantum secure digital signature (DS) schemes by the National Institute of Standards and Technology (NIST) rely on such alternate hard problems. Gaussian Elimination (GE) is a critical component in the signing procedure of these schemes. In this paper, we provide a masking scheme for GE with back substitution to defend against first- and higher-order attacks. To the best of our knowledge, this work is the first to analyze and propose masking techniques for multivariate- or code-based DS algorithms.
We propose a masked algorithm for transforming a system of linear equations into row-echelon form. This is realized by introducing techniques for efficiently making leading (pivot) elements one while avoiding costly conversions between Boolean and multiplicative masking at all orders. We also propose a technique for efficient masked back substitution, which eventually enables a secure unmasking of the public output. We evaluate the overhead of our countermeasure for several post-quantum candidates and their different security levels at first-, second-, and third-order, including UOV, MAYO, SNOVA, QR-UOV, and MQ-Sign. Notably, the operational cost of first-, second-, and third-order masked GE is 2.3$\times$ higher, and the randomness cost is 1.2$\times$ higher in MAYO compared to UOV for security levels III and V. In contrast, these costs are similar in UOV and MAYO for one version of level I. We also show detailed performance results for masked GE implementations for all three security versions of UOV on the Arm Cortex-M4 and compare them with unmasked results. Our first-order implementations targeting UOV parameters have overheads of factor 6.5$\times$, 5.9$\times$, and 5.7$\times$ compared to the unprotected implementation for NIST security level I, III, and V.
Multi-Holder Anonymous Credentials from BBS Signatures
The eIDAS 2.0 regulation aims to develop interoperable digital identities for European citizens, and it has recently become law. One of its requirements is that credentials be unlinkable. Anonymous credentials (AC) allow holders to prove statements about their identity in a way that does not require to reveal their identity and does not enable linking different usages of the same credential. As a result, they are likely to become the technology that provides digital identity for Europeans.
Any digital credential system, including anonymous credentials, needs to be secured against identity theft and fraud. In this work, we introduce the notion of a multi-holder anonymous credential scheme that allows issuing shares of credentials to different authentication factors (or ``holders''). To present the credential, the user's authentication factors jointly run a threshold presentation protocol. Our definition of security requires that the scheme provide unforgeability: the adversary cannot succeed in presenting a credential with identity attributes that do not correspond to an identity for which the adversary controls at least $t$ shares; this is true even if the adversary can obtain credentials of its choice and cause concurrent executions of the presentation protocol. Further, our definition requires that the presentation protocol provide security with identifiable abort. Finally, presentations generated by all honest holders must be unlinkable and must not reveal the user's secret identity attributes even to an adversary that controls some of the user's authentication factors.
We design and prove the (concurrent) security of a multi-holder version of the BBS anonymous credential scheme. In our construction, each holder is issued a secret share of a BBS credential.
Using these shares, the holders jointly compute a credential presentation that is identical to (and therefore compatible with) the traditional, single-holder variant (due to Tessaro and Zhu, Eurocrypt'23) of a BBS credential presentation.
$\mathsf{Cirrus}$: Performant and Accountable Distributed SNARK
As Succinct Non-interactive Arguments of Knowledge (SNARKs) gain traction for large-scale applications, distributed proof generation is a promising technique to horizontally scale up the performance. In such protocols, the workload to generate SNARK proofs is distributed among a set of workers, potentially with the help of a coordinator. Desiderata include linear worker time (in the size of their sub-tasks), low coordination overhead, low communication complexity, and accountability (the coordinator can identify malicious workers). State-of-the-art schemes, however, do not achieve these properties.
In this paper, we introduced $\mathsf{Cirrus}$, the first accountable distributed proof generation protocol with linear computation complexity for all parties. $\mathsf{Cirrus}$ is based on HyperPlonk (EUROCRYPT'23) and therefore supports a universal trusted setup.
$\mathsf{Cirrus}$ is horizontally scalable: proving statements about a circuit of size $O(MT)$ takes $O(T)$ time with $M$ workers. The per-machine communication cost of $\mathsf{Cirrus}$ is low, which is only logarithmic in the size of each sub-circuit. $\mathsf{Cirrus}$ is also accountable, and the verification overhead of the coordinator is efficient. We further devised a load balancing technique to make the workload of the coordinator independent of the size of each sub-circuit.
We implemented an end-to-end prototype of $\mathsf{Cirrus}$ and evaluated its performance on modestly powerful machines. Our results confirm the horizontal scalability of $\mathsf{Cirrus}$, and the proof generation time for circuits with $2^{25}$ gates is roughly $40$s using $32$ $8$-core machines. We also compared $\mathsf{Cirrus}$ with Hekaton (CCS'24), and $\mathsf{Cirrus}$ is faster when proving PLONK-friendly circuits such as Pedersen hash.
ColliderScript: Covenants in Bitcoin via 160-bit hash collisions
We introduce a method for enforcing covenants on Bitcoin outputs without requiring any changes to Bitcoin by designing a hash collision based equivalence check which bridges Bitcoin's limited Big Script to Bitcoin's Small Script. This allows us evaluate the signature of the spending transaction (available only to Big Script) in Small Script. As Small Script enables arbitrary computations, we can introspect into the spending transaction and enforce covenants on it.
Our approach leverages finding collisions in the $160$-bit hash functions: SHA-1 and RIPEMD-160. By the birthday bound this should cost $\sim2^{80}$ work. Each spend of our covenant costs $\sim2^{86}$ hash queries and $\sim2^{56}$ bytes of space. For security, we rely on an assumption regarding the hardness of finding a $3$-way collision (with short random inputs) in $160$-bit hash functions, arguing that if the assumption holds, breaking covenant enforcement requires $\sim2^{110}$ hash queries. To put this in perspective, the work to spend our covenant is $\sim33$ hours of the Bitcoin mining network, whereas breaking our covenant requires $\sim 450,000$ years of the Bitcoin mining network.
We believe there are multiple directions of future work that can significantly improve these numbers.
Evaluating covenants and our equivalence check requires performing many operations in Small Script, which must take no more than $4$ megabytes in total size, as Bitcoin does not allow transactions greater than $4$ megabytes. We only provide rough estimates of the transaction size because, as of this writing, no Small Script implementations of the hash functions required, SHA-1 and RIPEMD-160, have been written.
Amigo: Secure Group Mesh Messaging in Realistic Protest Settings
In large-scale protests, a repressive government will often disable the Internet to thwart communication between protesters. Smartphone mesh networks, which route messages over short range, possibly ephemeral, radio connections between nearby phones, allow protesters to communicate without relying on centralized Internet infrastructure. Unfortunately, prior work on mesh networks does not efficiently support cryptographically secure group messaging (a crucial requirement for protests); prior networks were also evaluated in unrealistically benign network environments which fail to accurately capture the link churn and physical spectrum contention found in realistic protest environments. In this paper, we introduce Amigo, an anonymous mesh messaging system which supports group communication through continuous key agreement, and forwards messages using a novel routing protocol designed to handle the challenges of ad-hoc routing scenarios. Our extensive simulations reveal the poor scalability of prior approaches, the benefits of Amigo's protest-specific optimizations, and the challenges that still must be solved to scale secure mesh networks to protests with thousands of participants.
Field-Agnostic SNARKs from Expand-Accumulate Codes
Efficient realizations of succinct non-interactive arguments of knowledge (SNARKs) have gained popularity due to their practical applications in various domains. Among existing schemes, those based on error-correcting codes are of particular interest because of their good concrete efficiency, transparent setup, and plausible post-quantum security. However, many existing code-based SNARKs suffer from the
disadvantage that they only work over specific finite fields.
In this work, we construct a code-based SNARK that does not rely on any specific underlying field; i.e., it is field-agnostic. Our construction follows the framework of Brakedown (CRYPTO '23) and builds a polynomial commitment scheme (and hence a SNARK) based on recently introduced expand-accumulate codes. Our work generalizes these codes to arbitrary finite fields; our main technical contribution is showing that, with high probability, these codes have constant rate and constant relative distance (crucial properties for building efficient SNARKs), solving an open problem from prior work.
As a result of our work we obtain a SNARK where, for a statement of size $M$ , the prover time is $O(M \log M )$ and the proof size is $O(\sqrt{M} )$. We demonstrate the concrete efficiency of our scheme empirically via experiments. Proving ECDSA verification on the secp256k1 curve requires only 0.23s for proof generation, 2 orders of magnitude faster than SNARKs that are not field-agnostic. Compared to the original Brakedown result (which is also field-agnostic), we obtain proofs that are 1.9–2.8$\times$ smaller due to the good concrete distance of our underlying error-correcting code, while introducing only a small overhead of 1.2$\times$ in the prover time.
Towards Practical Oblivious Map
Oblivious map (OMAP) is an important component in encrypted databases, utilized to safeguard against the server inferring sensitive information about client's encrypted key-value stores based on access patterns. Despite its widespread usage and importance, existing OMAP solutions face practical challenges, including the need for a large number of interaction rounds between the client and server, as well as the substantial communication bandwidth requirements. For example, the state-of-the-art protocol named OMIX++ in VLDB 2024 still requires $O(\log{n})$ interaction rounds and $O(\log^2{n})$ communication bandwidth per access, where $n$ denotes the total number of key-value pairs stored.
In this work, we introduce more practical and efficient OMAP constructions. Consistent with all prior OMAPs, our constructions also adapt only the tree-based Oblivious RAM (ORAM) and oblivious data structures (ODS) to achieve OMAP for enhanced practicality. In complexity, our approach needs $O(\log{n}/\log{\log{n})+O(\log{\lambda})}$ interaction rounds and $O(\log^2{n}/\log{\log{n}})+O(\log{\lambda}\log{n})$ communication bandwidth per data access where $\lambda$ is the security parameter. This new complexity results from our two main contributions. First, unlike prior works that rely solely on search trees, we design a novel framework for OMAP that combines hash table with search trees. Second, we propose a more efficient tree-based ORAM named DAORAM, which is of significant independent interest. This newly developed ORAM noticeably accelerates our constructions as it supports obliviously accessing hash tables much more efficiently. We implement both our proposed constructions and prior methods to experimentally demonstrate that our constructions substantially outperform prior methods in terms of efficiency.
Constructions of self-orthogonal codes and LCD codes from functions over finite fields
The construction of self-orthogonal codes from functions over finite fields has been widely studied in the literature. In this paper, we construct new families of self-orthogonal linear codes with few weights from trace functions and weakly regular plateaued functions over the finite fields of odd characteristics. We determine all parameters of the constructed self-orthogonal codes and their dual codes. Moreover, we employ the constructed $p$-ary self-orthogonal codes to construct $p$-ary LCD codes.
A Closer Look at Falcon
Falcon is a winner of NIST's six-year post-quantum cryptography standardisation competition. Based on the celebrated full-domain-hash framework of Gentry, Peikert and Vaikuntanathan (GPV) (STOC'08), Falcon leverages NTRU lattices to achieve the most compact signatures among lattice-based schemes.
Its security hinges on a Rényi divergence-based argument for Gaussian samplers, a core element of the scheme. However, the GPV proof, which uses statistical distance to argue closeness of distributions, fails when applied naively to Falcon due to parameter choices resulting in statistical distances as large as $2^{-34}$. Additional implementation-driven deviations from the GPV framework further invalidate the original proof, leaving Falcon without a security proof despite its selection for standardisation.
This work takes a closer look at Falcon and demonstrates that introducing a few minor, conservative modifications allows for the first formal proof of the scheme in the random oracle model. At the heart of our analysis lies an adaptation of the GPV framework to work with the Rényi divergence, along with an optimised method for parameter selection under this measure. Furthermore, we obtain a provable version of the GPV framework over NTRU rings. Both these tools may be of independent interest.
Unfortunately, our analysis shows that despite our modification of Falcon-512 and Falcon-1024 we do not achieve strong unforgeability for either scheme. For plain unforgeability we are able to show that our modifications to Falcon-512 barely satisfy the claimed 120-bit security target and for Falcon-1024 we confirm the claimed security level. As such we recommend revisiting falcon and its parameters.
Concretely Efficient Input Transformation Based Zero-Knowledge Argument System for Arbitrary Circuits
We introduce a new transparent zero-knowledge argument system based on the direct computation concept described in this paper. Our protocol converts input parameters into a format that any circuit can process directly. Once the circuit output can be computed using transformed inputs, the output of the polynomial a circuit generates can also be correctly computed by the verifier, allowing us to reduce the polynomial evaluation cost significantly.
In the default setting, the prover runtime cost for group exponentiation operations is only the square root of the degree ($\sqrt{p_d}$) of the polynomial the circuit generates. Furthermore, leveraging the ``merging through addition" and ``bootstrapping with breakers" techniques, the size of the polynomial our protocol generates can be much smaller than the total number of multiplicative operations. Our benchmark result shows our approach can significantly improve both prover runtime and verifier runtime performance over state-of-the-art by almost or over one order of magnitude while keeping the communication cost comparable with that of the state-of-the-art.
Our approach also allows our protocol to be made memory-efficient while being non-interactive. The theoretical memory cost of our protocol is $O(b)$, where $b = \sqrt{p_d}$ in the default setting. Setting the bootstrapping parameter ($b$) aggressively will result in better prover runtime performance at the expense of the higher communication cost.
Access-Controlled Inner Product Function-Revealing Encryption
We extend the concept of access control for functional encryption, introduced by Abdalla et al. (ASIACRYPT 2020), to function-revealing encryption (Joy and Passelègue, SCN 2018). Here “access control” means that function evaluation is only possible when a specified access policy is met. Specifically, we introduce access-controlled inner product function-revealing encryption (AC-IPFRE) and give two applications.
On the theoretical side, we use AC-IPFRE to show that function-hiding inner-product functional encryption (FH-IPFE), introduced by Bishop et al. (ASIACRYPT 2015), is equivalent to IPFRE. To show this, we in particular generically construct AC-IPFRE from IPFRE for the “non-zero inner-product” (NZIP) access policy. This result uses an effective version of Lagrange’s Four Square Theorem. One consequence of this result is that lower bounds by Ünal (EUROCRYPT 2020) suggest that, as for FH-IPFE, bilinear pairings will be needed to build IPFRE.
On the practical side, we build an outsourced approximate nearest-neighbor (ANN) search protocol and mitigate its leakage via AC-IPFRE. For this, we construct a practical AC-IPFRE scheme in the generic bilinear group model for a specific access policy for ANN search. To this end, we show that techniques of Wee (TCC 2020) implicitly give the most practical FH-IPFE scheme to date. We implement the resulting outsourced ANN search protocol and report on its performance.
Of independent interest, we show AC-IPFRE for NZIP implies attribute-hiding small-universe AC-IPFRE for arbitrary access policies. Previous work on access control for FE did not achieve attribute hiding. Overall, our results demonstrate that AC-IPFRE is of both theoretical and practical interest and set the stage for future work in the area.
A Hard-Label Cryptanalytic Extraction of Non-Fully Connected Deep Neural Networks using Side-Channel Attacks
During the past decade, Deep Neural Networks (DNNs) proved their value on a large variety of subjects. However despite their high value and public accessibility, the protection of the intellectual property of DNNs is still an issue and an emerging research field. Recent works have successfully extracted fully-connected DNNs using cryptanalytic methods in hard-label settings, proving that it was possible to copy a DNN with high fidelity, i.e., high similitude in the output predictions. However, the current cryptanalytic attacks cannot target complex, i.e., not fully connected, DNNs and are limited to special cases of neurons present in deep networks.
In this work, we introduce a new end-to-end attack framework designed for model extraction of embedded DNNs with high fidelity. We describe a new black-box side-channel attack which splits the DNN in several linear parts for which we can perform cryptanalytic extraction and retrieve the weights in hard-label settings. With this method, we are able to adapt cryptanalytic extraction, for the first time, to non-fully connected DNNs, while maintaining a high fidelity. We validate our contributions by targeting several architectures implemented on a microcontroller unit, including a Multi-Layer Perceptron (MLP) of 1.7 million parameters and a shortened MobileNetv1. Our framework successfully extracts all of these DNNs with high fidelity (88.4% for the MobileNetv1 and 93.2% for the MLP). Furthermore, we use the stolen model to generate adversarial examples and achieve close to white-box performance on the victim's model (95.8% and 96.7% transfer rate).
Black-box Collision Attacks on the NeuralHash Perceptual Hash Function
Perceptual hash functions map multimedia content that is perceptually close to outputs strings that are identical or similar. They are widely used for the identification of protected copyright and illegal content in information sharing services: a list of undesirable files is hashed with a perceptual hash function and compared, server side, to the hash of the content that is uploaded. Unlike cryptographic hash functions, the design details of perceptual hash functions are typically kept secret.
Several governments envisage to extend this detection to end-to-end encrypted services by using Client Side Scanning and local matching against a hashed database. In August 2021, Apple hash published a concrete design for Client Side Scanning based on the NeuralHash perceptual hash function that uses deep learning.
There has been a wide criticism of Client Side Scanning based on its disproportionate impact on human rights and risks for function creep and abuse. In addition, several authors have demonstrated that perceptual hash functions are vulnerable to cryptanalysis: it is easy to create false positives and false negatives once the design is known. This paper demonstrates that these designs are vulnerable in a weaker black-box attack model. It is demonstrated that the effective security level of NeuralHash for a realistic set of images is 32 bits rather than 96 bits, implying that finding a collision requires $2^{16}$ steps rather than $2^{48}$. As a consequence, the large scale deployment of NeuralHash would lead to a huge number of false positives, making the system unworkable. It is likely that most current perceptual hash function designs exhibit similar vulnerabilities.
VCVio: A Formally Verified Forking Lemma and Fiat-Shamir Transform, via a Flexible and Expressive Oracle Representation
As cryptographic protocols continue to become more complex and specialized, their security proofs have grown more complex as well, making manual verification of their correctness more difficult. Formal verification via proof assistants has become a popular approach to solving this, by allowing researchers to write security proofs that can be verified correct by a computer.
In this paper we present a new framework of this kind for verifying security proofs, taking a foundational approach to representing and reasoning about protocols. We implement our framework in the Lean programming language, and give a number of security proofs to demonstrate that our system is both powerful and usable, with comparable automation to similar systems.
Our framework is especially focused on reasoning about and manipulating oracle access, and we demonstrate the usefulness of this approach by implementing both a general forking lemma and a version of the Fiat-Shamir transform for sigma protocols. As a simple case study we then instantiate these to an implementation of a Schnorr-like signature scheme.
HyperPianist: Pianist with Linear-Time Prover and Logarithmic Communication Cost
Recent years have seen great improvements in zero-knowledge proofs (ZKPs). Among them, zero-knowledge SNARKs are notable for their compact and efficiently-verifiable proofs, but suffer from high prover costs. Wu et al. (Usenix Security 2018) proposed to distribute the proving task across multiple machines, and achieved significant improvements in proving time. However, existing distributed ZKP systems still have quasi-linear prover cost, and may incur a communication cost that is linear in circuit size.
In this paper, we introduce HyperPianist. Inspired by the state-of-the-art distributed ZKP system Pianist (Liu et al., S&P 2024) and the multivariate proof system HyperPlonk (Chen et al., EUROCRYPT 2023), we design a distributed multivariate polynomial interactive oracle proof (PIOP) system with a linear-time prover cost and logarithmic communication cost. Unlike Pianist, HyperPianist incurs no extra overhead in prover time or communication when applied to general (non-data-parallel) circuits. To instantiate the PIOP system, we adapt two additively-homomorphic multivariate polynomial commitment schemes, multivariate KZG (Papamanthou et al., TCC 2013) and Dory (Lee et al., TCC 2021), into the distributed setting, and get HyperPianist^K and HyperPianist^D respectively. Both systems have linear prover complexity and logarithmic communication cost; furthermore, HyperPianist^D requires no trusted setup. We also propose HyperPianist+, incorporating an optimized lookup argument based on Lasso (Setty et al., EUROCRYPT 2024) with lower prover cost.
Experiments demonstrate HyperPianist^K and HyperPianist^D achieve a speedup of 66.8\times and 44.9\times over HyperPlonk with 32 distributed machines. Compared to Pianist, HyperPianistK can be 3.2\times and 5.0\times as fast and HyperPianistD can be 2.7\times and 4.1\times as fast, on vanilla gates and custom gates respectively.
IMOK: A compact connector for non-prohibition proofs to privacy-preserving applications
This article proposes an extension for privacy-preserving applications to introduce sanctions or prohibition lists. When initiating a particular action, the user can prove, in addition to the application logic, that they are not part of the sanctions lists (one or more) without compromising sensitive data. We will show how this solution can be integrated into applications, using the example of extending Freedom Tool (a voting solution based on biometric passports). We will also consider ways to manage these lists, versioning principles, configuring the filter data set, combining different lists, and using the described method in other privacy-preserving applications.
SwiftEC: Shallue–van de Woestijne Indifferentiable Function To Elliptic Curves
Hashing arbitrary values to points on an elliptic curve is a required step in many cryptographic constructions, and a number of techniques have been proposed to do so over the years. One of the first ones was due to Shallue and van de Woestijne (ANTS-VII), and it had the interesting property of applying to essentially all elliptic curves over finite fields. It did not, however, have the desirable property of being indifferentiable from a random oracle when composed with a random oracle to the base field.
Various approaches have since been considered to overcome this limitation, starting with the foundational work of Brier et al. (CRYPTO 2011). For example, if $f\colon \mathbb{F}_q\to E(\mathbb{F}_q)$ is the Shallue--van de Woestijne (SW) map and $\mathfrak{h}_1,\mathfrak{h}_2$ are two independent random oracles to $\mathbb{F}_q$, we now know that $m\mapsto f\big(\mathfrak{h}_1(m)\big)+f\big(\mathfrak{h}_2(m)\big)$ is indifferentiable from a random oracle. Unfortunately, this approach has the drawback of being twice as expensive to compute than the straightforward, but not indifferentiable, $m\mapsto f\big(\mathfrak{h}_1(m)\big)$. Most other solutions so far have had the same issue: they are at least as costly as two base field exponentiations, whereas plain encoding maps like $f$ cost only one exponentiation. Recently, Koshelev (DCC 2022) provided the first construction of indifferentiable hashing at the cost of one exponentiation, but only for a very specific class of curves (some of those with $j$-invariant $0$), and using techniques that are unlikely to apply more broadly.
In this work, we revisit this long-standing open problem, and observe that the SW map actually fits in a one-parameter family $(f_u)_{u\in\mathbb{F}_q}$ of encodings, such that for independent random oracles $\mathfrak{h}_1, \mathfrak{h}_2$ to $\mathbb{F}_q$, $F\colon m\mapsto f_{\mathfrak{h}_2(m)}\big(\mathfrak{h}_1(m)\big)$ is indifferentiable. Moreover, on a very large class of curves (essentially those that are either of odd order or of order divisible by 4), the one-parameter family admits a rational parametrization, which let us compute $F$ at almost the same cost as small $f$, and finally achieve indifferentiable hashing to most curves with a single exponentiation.
Our new approach also yields an improved variant of the Elligator Squared technique of Tibouchi (FC 2014) that represents points of arbitrary elliptic curves as close-to-uniform random strings.
SophOMR: Improved Oblivious Message Retrieval from SIMD-Aware Homomorphic Compression
Privacy-preserving blockchains and private messaging services that ensure receiver-privacy face a significant UX challenge: each client must scan every payload posted on the public bulletin board individually to avoid missing messages intended for them. Oblivious Message Retrieval (OMR) addresses this issue by securely outsourcing this expensive scanning process to a service provider using Homomorphic Encryption (HE).
In this work, we propose a new OMR scheme that substantially improves upon the previous state-of-the-art, PerfOMR (USENIX Security'24). Our implementation demonstrates reductions of 3.3x in runtime, 2.2x in digest size, and 1.5x in key size, in a scenario with 65536 payloads (each 612 bytes), of which up to 50 are pertinent.
At the core of these improvements is a new homomorphic compression mechanism, where ciphertexts of length proportional to the number of total payloads are compressed into a digest whose length is proportional to the upper bound on the number of pertinent payloads. Unlike previous approaches, our scheme fully exploits the native homomorphic SIMD structure of the underlying HE scheme, significantly enhancing efficiency. In the setting described above, our compression scheme achieves 7.4x speedup compared to PerfOMR.
ARCHER: Architecture-Level Simulator for Side-Channel Analysis in RISC-V Processors
Side-channel attacks pose a serious risk to cryptographic implementations, particularly in embedded systems. While current methods, such as test vector leakage assessment (TVLA), can identify leakage points, they do not provide insights into their root causes. We propose ARCHER, an architecture-level tool designed to perform side-channel analysis and root cause identification for software cryptographic implementations on RISC-V processors. ARCHER has two main components: (1) Side-Channel Analysis to identify leakage using TVLA and its variants, and (2) Data Flow Analysis to track intermediate values across instructions, explaining observed leaks.
Taking the binary file of the target implementation as input, ARCHER generates interactive visualizations and a detailed report highlighting execution statistics, leakage points, and their causes. It is the first architecture-level tool tailored for the RISC-V architecture to guide the implementation of cryptographic algorithms resistant to power side-channel attacks. ARCHER is algorithm-agnostic, supports pre-silicon analysis for both high-level and assembly code, and enables efficient root cause identification. We demonstrate ARCHER’s effectiveness through case studies on AES and ASCON implementations, where it accurately traces the source of side-channel leaks.
A Variation on Knellwolf and Meier's Attack on the Knapsack Generator
Pseudo-random generators are deterministic algorithms that take in input a random secret seed and output a flow of random-looking numbers. The Knapsack generator, presented by Rueppel and Massey in 1985 is one of the many attempt at designing a pseudo-random generator that is cryptographically secure. It is based on the subset-sum problem, a variant of the Knapsack optimization problem, which is considered computationally hard.
In 2011 Simon Knellwolf et Willi Meier found a way to go around this hard problem and exhibited a weakness of this generator. In addition to be able to distinguish the outputs from the uniform distribution, they designed an algorithm that retrieves a large portion of the secret. We present here an alternate version of the attack, with similar costs, that works on a larger range of parameters and retrieves a larger portion of the secret.
Tightly-Secure Group Key Exchange with Perfect Forward Secrecy
In this work, we present a new paradigm for constructing Group Authenticated Key Exchange (GAKE). This result is the first tightly secure GAKE scheme in a strong security model that allows maximum exposure attacks (MEX) where the attacker is allowed to either reveal the secret session state or the long-term secret of all communication partners. Moreover, our protocol features the strong and realistic notion of (full) perfect forward secrecy (PFS), that allows the attacker to actively modify messages before corrupting parties.
We obtain our results via a series of tightly secure transformations. Our first transformation is from weakly secure KEMs to unilateral authenticated key exchange (UAKE) with weak forward secrecy (WFS). Next, we show how to turn this into an UAKE with PFS in the random oracle model.
Finally, and as one of our major novel conceptual contributions, we describe how to build GAKE protocols from UAKE protocols, also in the random oracle model.
We apply our transformations to obtain two practical GAKE protocols with tight security. The first is based on the DDH assumption and features low message complexity. Our second result is based on the LWE assumption. In this way, we obtain the first GAKE protocol from a post-quantum assumption that is tightly secure in a strong model of security allowing MEX attacks.
$\widetilde{\mbox{O}}$ptimal Adaptively Secure Hash-based Asynchronous Common Subset
Asynchronous multiparty computation (AMPC) requires an input agreement phase where all participants have a consistent view of the set of private inputs. While the input agreement problem can be precisely addressed by a Byzantine fault-tolerant consensus known as Asynchronous Common Subset (ACS), existing ACS constructions with potential post-quantum security have a large $\widetilde{\mathcal{O}}(n^3)$ communication complexity for a network of $n$ nodes. This poses a bottleneck for AMPC in the same setting. In contrast, ACS has optimal constructions with quadratic communication complexity based on bilinear map assumptions.
In this paper, we bridge this gap by introducing a nearly optimal ACS, which solely relies on the blackbox use of collision-resistant hash functions. It exhibits $\widetilde{\mathcal{O}}(n^2)$ communication complexity, expected constant round complexity, and security against adaptive adversaries who can corrupt up to $n/3$ nodes and perform ``after-fact-removal'' attacks.
At the core of our new ACS is the first nearly optimal hash-based Multi-valued Validated Byzantine Agreement (MVBA).
To reduce cubic communication while avoiding heavy cryptographic tools, we introduce a new design paradigm, with several new components. We define and analyze our MVBA and components within the UC-framework, facilitating their modular use in broader applications, particularly in AMPC.
Tweakable ForkCipher from Ideal Block Cipher
In ASIACRYPT 2019, Andreeva et al. introduced a new symmetric key primitive called the $\textit{forkcipher}$, designed for lightweight applications handling short messages. A forkcipher is a keyed function with a public tweak, featuring fixed-length input and fixed-length (expanding) output. They also proposed a specific forkcipher, ForkSkinny, based on the tweakable block cipher SKINNY, and its security was evaluated through cryptanalysis. Since then, several efficient AEAD and MAC schemes based on forkciphers have been proposed, catering not only to short messages but also to various purposes such as leakage resilience and cloud security. While forkciphers have proven to be efficient solutions for designing AEAD schemes, the area of forkcipher design remains unexplored, particularly the lack of provably secure forkcipher constructions.
In this work, we propose forkcipher design for various tweak lengths, based on a block cipher as the underlying primitive. We provide proofs of security for these constructions, assuming the underlying block cipher behaves as an ideal block cipher. First, we present a forkcipher, $\widetilde{\textsf{F}}1$, for an $n$-bit tweak and prove its optimal ($n$-bit) security. Next, we propose another construction, $\widetilde{\textsf{F}}2$, for a $2n$-bit tweak, also proving its optimal ($n$-bit) security. Finally, we introduce a construction, $\widetilde{\textsf{F}}r$, for a general $rn$-bit tweak, achieving $n$-bit security.
Carbon Footprint Traction System Incorporated as Blockchain
This article tries to offer a solution to an environmental sustainability problem using a forward-thinking approach and tries to construct a carbon footprint tracking system based on blockchain technology while also introducing tokenization intertwined with the blockchain to make everyday use as accessible and effective as possible.
This effort aims to provide a solid use case for environmental sustainability and lays the groundwork of a new generation social construct where carbon footprint is a valuable unit like money next to the other important tokenized attributes a person can possibly hold. The study proposes a blockchain-based solution to store the data. Through tokenization, the transacting and sharing is facilitated. As a result, carbon footprint data can be treated as a fungible utility token.
The article tries to explain how and which blockchain technology offers an effective solution to challenges in global carbon tracking systems. In this context, a use case was proposed. The critical features of the blockchain-based platform are examined. In addition, the roles of parties and user interactions within the system are detailed.
In conclusion, this article proposes the adaptation of blockchain technology together with smart contracts and tokenization to the management of carbon footprints.
Schnorr Signatures are Tightly Secure in the ROM under a Non-interactive Assumption
We show that the widely-used Schnorr signature scheme meets existential unforgeability under chosen-message attack (EUF-CMA) in the random oracle model (ROM) if the circular discrete-logarithm (CDL) assumption, a new, non-interactive and falsifiable variant of the discrete-log (DL) problem we introduce, holds in the underlying group. Notably, our reduction is tight, meaning the constructed adversary against CDL has essentially the same running time and success probability as the assumed forger. This is crucial for justifying the size of the underlying group used in practice. To our knowledge, we are the first to exhibit such a reduction. Indeed, prior work required interactive and non-falsifiable assumptions (Bellare and Dai, INDOCRYPT 2020) or additional idealized models beyond the ROM like the algebraic group model (Fuchsbauer et al., EUROCRYPT 2020). We justify CDL by showing it holds in two carefully-chosen idealized models that idealize different aspects of it. Namely, we show that CDL is as hard as DL in these models.
BatchZK: A Fully Pipelined GPU-Accelerated System for Batch Generation of Zero-Knowledge Proofs
Zero-knowledge proof (ZKP) is a cryptographic primitive that enables one party to prove the validity of a statement to other parties without disclosing any secret information. With its widespread adoption in applications such as blockchain and verifiable machine learning, the demand for generating zero-knowledge proofs has increased dramatically. In recent years, considerable efforts have been directed toward developing GPU-accelerated systems for proof generation. However, these previous systems only explored efficiently generating a single proof by reducing latency rather than batch generation to provide high throughput.
We propose a fully pipelined GPU-accelerated system for batch generation of zero-knowledge proofs. Our system has three features to improve throughput. First, we design a pipelined approach that enables each GPU thread to continuously execute its designated proof generation task without being idle. Second, our system supports recent efficient ZKP protocols with their computational modules: sum-check protocol, Merkle tree, and linear-time encoder. We customize these modules to fit our pipelined execution. Third, we adopt a dynamic loading method for the data required for proof generation, reducing the required device memory. Moreover, multi-stream technology enables the overlap of data transfers and GPU computations, reducing overhead caused by data exchanges between host and device memory.
We implement our system and evaluate it on various GPU cards. The results show that our system achieves more than 259.5× higher throughput compared to state-of-the-art GPU-accelerated systems. Moreover, we deploy our system in the verifiable machine learning application, where our system generates 9.52 proofs per second, successfully achieving sub-second proof generation for the first time in this field.
Double-Matrix: Complete Diffusion in a Single Round with (small) MDS Matrices
This paper describes a simple idea to improve (text) diffusion in block ciphers that use MDS codes but that take more than a single round to achieve full (text) diffusion. The Rijndael cipher family is used as an example since it comprises ciphers with different state sizes.
A drawback of the new approach is the additional computational cost, but it is competitive compared to large MDS matrices used in the Khazad and Kuznyechik ciphers.
Refined Strategy for Solving LWE in Two-step Mode
Learning with Errors (LWE) and its variants are widely used
in constructing lattice-based cryptographic schemes, including NIST standards Kyber and Dilithium, and a refined estimation of LWE’s hardness is crucial for their security. Currently, primal attack is considered the fastest algorithm for solving LWE problem in practice. It reduces LWE to a unique Shortest Vector Problem (uSVP) and combines lattice reduction algorithms with SVP calls such as enumeration or sieving. However, finding the most time-efficient combination strategy for these algorithms remains a challenge. The designers of Kyber highlighted this issue as open problem Q7: “A refined (progressive) lattice reduction strategy and a precise analysis of the gains using reduction preprocessing plus a single SVP call in large dimensions are still missing.” In this paper, we address this problem by presenting a Strategy Search algorithm named PSSearch for solving uSVP and LWE, using progressive BKZ as the lattice reduction and sieving as the SVP call. Compared to the heuristic strategy used in G6K (Albrechet et al., Eurocrypt 2019), the strategy generated by our algorithm has the following advantages: (1) We design a tree search algorithm with pruning named PSSearch to find the minimal time-cost strategy in two-step mode and prove its correctness, showing that the fastest approach in two-step mode for solving uSVP and LWE can be achieved in a reasonable timeframe; (2) We propose the first tight simulation for BKZ that can jump by J > 1 blocks, which allows us to choose more flexible jump values to improve reduction efficiency.
(3) We propose a refined dimension estimation method for the SVP call. We tested the accuracy of our new simulation algorithm and the efficiency of our new strategy through experiments. Furthermore, we apply the strategies generated by SSearch to solve the TU Darmstadt LWE Challenges with (n, α) ∈{(80, 0.005), (40, 0.035), (90, 0.005), (50, 0.025), (55, 0.020), (40, 0.040)} using the G6K framework, achieving improve-
ments of 7.2 to 23.4 times over the heuristic strategy employed in G6K.
By combining the minimal time-cost strategy selection with the refined two-step estimator for LWE (Xia et al., PKC 2024), we re-estimate the hardness of NIST standards Kyber and Dilithium, and determine the influence of the strategy. Specifically, the security levels of NIST standards decrease by 3.4 to 4.6 bits, rather than 2 to 8 bits indicated in the Kyber documentation. It achieves a decrease of 1.1 to 1.3 bits compared to the refined two-step estimation using trivial strategy.
Areion: Highly-Efficient Permutations and Its Applications (Extended Version)
In real-world applications, the overwhelming majority of cases require (authenticated) encryption or hashing with relatively short input, say up to 2K bytes. Almost all TCP/IP packets are 40 to 1.5K bytes, and the maximum packet lengths of major protocols, e.g., Zigbee, Bluetooth low energy, and Controller Area Network (CAN), are less than 128 bytes. However, existing schemes are not well optimized for short input. To bridge the gap between real-world needs (in the future) and limited performances of state-of-the-art hash functions and authenticated encryptions with associated data (AEADs) for short input, we design a family of wide-block permutations Areion that fully leverages the power of AES instructions, which are widely deployed in many devices. As for its applications, we propose several hash functions and AEADs. Areion significantly outperforms existing schemes for short input and even competitive to relatively long messages. Indeed, our hash function is surprisingly fast, and its performance is less than three cycles/byte in the latest Intel architecture for any message size. It is significantly much faster than existing state-of-the-art schemes for short messages up to around 100 bytes, which are the most widely-used input size in real-world applications, on both the latest CPU architectures (IceLake, Tiger Lake, and Alder Lake) and mobile platforms (Pixel 7, iPhone 14, and iPad Pro with Apple M2).
Fully Encrypted Machine Learning Protocol using Functional Encryption
As privacy concerns have arisen in machine learning, privacy-preserving machine learning (PPML) has received significant attention. Fully homomorphic encryption (FHE) and secure multi-party computation (MPC) are representative building blocks for PPML. However, in PPML protocols based on FHE and MPC, interaction between the client (who provides encrypted input data) and the evaluator (who performs the computation) is essential to obtain the final result in plaintext.
Functional encryption (FE) is a promising candidate to remove this constraint, but existing FE-based PPML protocols are restricted to evaluating only simple ML models, such as one-layer neural networks, or they support partially encrypted PPML, which makes them vulnerable to information leakage beyond the inference results.
In this paper, we propose a fully encrypted FE-based PPML protocol, which supports the evaluation of arbitrary functions over encrypted data with no information leakage during computation, for the first time.
To achieve this, we newly construct a vector functional encryption scheme for quadratic polynomials and combine it with an inner product encryption scheme. This enables multiple compositions of quadratic polynomials to compute arbitrary complex functions in an encrypted manner.
Our FE-based PPML protocol is secure in the malicious model, which means that an adversary cannot obtain any information about the input data even though they intentionally deviate from the protocol.
We then show how to use our protocol to build a fully encrypted 2-layer neural network model with quadratic activation functions and present experimental results.
A New Method to Test the Zeros of Riemann Zeta Function
The zeta function $\zeta(z)=\sum_{n=1}^{\infty} \frac{1}{n^z}$ is convergent only for $\text{Re}(z)>1$. To test its zeros, one needs to use the Riemann-Siegel function $Z(t)$. If $Z(t_1)$ and $Z(t_2)$ have opposite signs, $Z(t)$ vanishes between $t_1$ and $t_2$, and $\zeta(z)$ has a zero on the critical line between $\frac{1}{2}+it_1$ and $\frac{1}{2}+it_2$. This method is non-polynomial time, because it has to compute the sum $\sum_{n\leq \alpha}\frac{\cos(\vartheta(1/2+it)-t\log{n})}{\sqrt{n}}$, where $\alpha=\lfloor\sqrt{t/(2\pi)}\rfloor$. The eta function $\eta(z)=\sum_{n=1}^{\infty}\frac{(-1)^{n-1}}{n^z}$ is convergent for $\text{Re}(z)>0$, and $\eta(z)=\left(1-2^{1-z}\right)\zeta(z)$ for the critical strip $0<\text{Re}(z)<1$. The alternating series can be directly used to test the zeros because $\eta(z)$ and the analytic continuation of $\zeta(z)$ have the same zeros in the critical strip. In this paper, we present a polynomial time algorithm to test the zeros based on $\eta(z)$, which is more understandable and suitable for modern computing machines than the general method. Besides, we clarify the actual meaning of logarithm symbol in the Riemann-Siegel formula.
(In)Security of Threshold Fully Homomorphic Encryption based on Shamir Secret Sharing
Boneh et al. (CRYPTO'18) proposed two $t$-out-of-$N$ threshold fully homomorphic encryption ($\sf TFHE$) schemes based on Shamir secret sharing scheme and $\{0,1\}$-linear secret sharing scheme. They demonstrated the simulation security, ensuring no information leakage during partial or final decryption. This breakthrough allows any scheme to be converted into a threshold scheme by using $\sf TFHE$.
We propose two polynomial time algorithms to break the simulation security of $t$-out-of-$N$ $\sf TFHE$ based on Shamir secret sharing scheme proposed by Boneh et al.. First, we show that an adversary can break the simulation security by recovering the secret key under some constraints on $t$ and $N$, which does not violate the conditions for security proof. Next, we introduce a straightforward fix that theoretically satisfies the simulation security. However, we argue that this modification remains insecure insecure when implemented with any state-of-the-art fully homomorphic encryption libraries in practice.
To ensure robustness against our subsequent attacks, we recommend using an error-refreshing algorithm, such as bootstrapping or modulus switching, for each addition operation.
Rotatable Zero Knowledge Sets: Post Compromise Secure Auditable Dictionaries with application to Key Transparency
Key Transparency (KT) systems allow end-to-end encrypted service providers (messaging, calls, etc.) to maintain an auditable directory of their users’ public keys, producing proofs that all participants have a consistent view of those keys, and allowing each user to check updates to their own keys. KT has lately received a lot of attention, in particular its privacy preserving variants, which also ensure that users and auditors do not learn anything beyond what is necessary to use the service and keep the service provider accountable.
Abstractly, the problem of building such systems reduces to constructing so-called append-only Zero-Knowledge Sets (aZKS). Unfortunately, existing aZKS (and KT) solutions do not allow to adequately restore the privacy guarantees after a server compromise, a form of Post-Compromise Security (PCS), while maintaining the auditability properties. In this work we address this concern through the formalization of an extension of aZKS called Rotatable ZKS (RZKS). In addition to providing PCS, our notion of RZKS has several other attractive features, such as a stronger (extractable) soundness notion, and the ability for a communication party with out-of-date data to efficiently “catch up” to the current epoch while ensuring that the server did not erase any of the past data.
Of independent interest, we also introduce a new primitive called a Rotatable Verifiable Random Function (VRF), and show how to build RZKS in a modular fashion from a rotatable VRF, ordered accumulator, and append-only vector commitment schemes.
Khatam: Reducing the Communication Complexity of Code-Based SNARKs
We prove that Basefold(Crypto 2024) is secure in the $\textit{list decoding regime}$, within the double Johnson bound and with error probability $\frac{O(n)}{|F|}$. At the heart of this proof is a new, stronger statement for $\textit{correlated agreement}$, which roughly states that if a linear combination of vectors $\pi_L + r \pi_R$ agrees with a codeword at every element in $S \subset [n]$, then so do $\pi_L, \pi_R$. Our result is purely combinatorial and therefore extends to any finite field and any linear code. As such, it can be applied to any coding-based multilinear Polynomial Commitment Scheme to reduce its communication complexity.
OpenNTT: An Automated Toolchain for Compiling High-Performance NTT Accelerators in FHE
Modern cryptographic techniques such as fully homomorphic encryption (FHE) have recently gained broad attention. Most of these cryptosystems rely on lattice problems wherein polynomial multiplication forms the computational bottleneck. A popular method to accelerate these polynomial multiplications is the Number-Theoretic Transformation (NTT). Recent works aim to improve the practical deployability of NTT and propose toolchains supporting the NTT hardware accelerator design processes. However, existing design tools do not provide on-the-fly twiddle factor generation (TFG) which leads to high memory demands. Inspired by this situation, we present OpenNTT, a fully automated, open-source framework to compile NTT hardware accelerators with TFG for various NTT types and parameter sets. We address the challenge of combining conflict-free memory accesses and efficient, linear twiddle factor generation through a dedicated NTT processing order. Following this order, we develop a flexible twiddle factor generation method with minimal memory usage. These core concepts together with a frequency-optimized hardware architecture form our OpenNTT framework. We use OpenNTT to compile and test NTT hardware designs with various parameter sets on FPGAs. The obtained results show a clear memory reduction due to TFG and a speedup by 2.7× in latency and 2.2× in area-time-product, compared to prior arts.
"There's always another counter": Detecting Micro-architectural Attacks in a Probabilistically Interleaved Malicious/Benign Setting
Modern micro-architectural attacks use a variety of building blocks chained to develop a final exploit. However, since in most cases, the footprint of such attacks is not visible architecturally (like, in the file-system), it becomes trickier to defend against these. In light of this, several automated defence mechanisms use Hardware Performance Counters (HPCs) detect when the micro-architectural elements are being misused for a potential attacks (like flush-reload, Spectre, Meltdown etc.). In order to bypass such defences, recent works have proposed the idea of "probabilistic interleaving": the adversary interleaves the actual attack code with benign code with very low frequency. Such a strategy tips off the HPCs used for detection with a lot of unnecessary noise; recent studies have shown that probabilistically interleaved attacks can achieve an attack evasion rate of 100% (i.e. are virtually undetectable).
In this work, we contend this folklore. We develop a theoretical model of interleaved attacks using lightweight statistical tools like Gaussian Mixture Models and Dip Test for Unimodality and prove they are detectable for the correct choices of HPCs. Furthermore, we also show possible defence strategy against a stronger threat model than considered in literature: where the attacker interleaves multiple attacks instead of a single attack. Empirically, to instantiate our detector, in contrast to prior detection strategies, we choose LLMs for a number of reasons: (1) LLMs can easily contextualize data from a larger set of HPCs than generic machine learning techniques, and (2) with simple prompts, LLMs can quickly switch between different statistical analysis methods. To this end, we develop an LLM-based methodology to detect probabilistically interleaved attacks. Our experiments establish that our improved methodology is able to achieve 100% speculative attacks like Spectre v1/v2/v3, Meltdown, and Spectre v2 (with improved gadgets that even evade recent protections like Enhanced IBRS, IBPB conditional, and so on). This makes our methodology suitable for detecting speculative attacks in a non-profiled setting: where attack signatures might not be known in advance. All in all, we achieve a 100% attack detection rate, even with very low interleave frequencies (i.e. $10^{-6}$).
Our detection principle and its instantiation through LLMs shows how probabilistically interleaving attack code in benign execution is not a perfect strategy, and more research is still needed into developing and countering better attack evasion strategies.
MixBuy: Contingent Payment in the Presence of Coin Mixers
A contingent payment protocol involves two mutually distrustful parties, a buyer and a seller, operating on the same blockchain, and a digital product, whose ownership is not tracked on a blockchain (e.g. a digital book). The buyer holds coins on the blockchain and transfers them to the seller in exchange for the product. However, if the blockchain does not hide transaction details, any observer can learn that a buyer purchased some product from a seller.
In this work, we take contingent payment a step further: we consider a buyer who wishes to buy a digital product from a seller routing the payment via an untrusted mixer.
Crucially, we require that said payment is unlinkable, meaning that the mixer (or any other observer) does not learn which buyer is paying which seller. We refer to such setting as unlinkable contingent payment (UCP).
We present MixBuy, a system that realizes UCP. Mixbuy relies on oracle-based unlinkable contingent payment (O-UCP), a novel four-party cryptographic protocol where the mixer pays the seller and the seller provides the buyer with the product only if a semi-trusted notary attests that the buyer has paid the mixer. More specifically, we require four security notions: (i) mixer security that guarantees that if the mixer pays the seller, the mixer must get paid from the buyer; (ii) seller security that guarantees that if the seller delivers the product to the buyer, the seller must get paid from the mixer; (iii) buyer security that guarantees that if the buyer pays the mixer, the buyer must obtain the product; and (iv) unlinkability that guarantees that given a set of buyers and sellers, the mixer should not learn which buyer paid which seller.
We present a provably secure and efficient cryptographic construction for O-UCP. Our construction can be readily used to realize UCP on most blockchains, as it has minimal functionality requirements (i.e., digital signatures and timelocks). To demonstrate the practicality of our construction, we provide a proof of concept for O-UCP and our benchmarks in commodity hardware show that the communication overhead is small (a few kB per message) and the running time is below one second.
Lova: A Novel Framework for Verifying Mathematical Proofs with Incrementally Verifiable Computation
Efficiently verifying mathematical proofs and computations has been a heavily researched topic within Computer Science. Particularly, even repetitive steps within a proof become much more complex and inefficient to validate as proof sizes grow. To solve this problem, we suggest viewing it through the lens of Incrementally Verifiable Computation (IVC). However, many IVC methods, including the state-of-the-art Nova recursive SNARKs, require proofs to be linear and for each proof step to be identical. This paper proposes Lova, a novel framework to verify mathematical proofs end-to-end that solves these problems. Particularly, our approach achieves a few novelties alongside the first-of-its-kind implementation of Nova: (i) an innovative proof splicing mechanism to generate independent proof sequences, (ii) a system of linear algorithms to verify a variety of mathematical logic rules, and (iii) a novel multiplexing circuit allowing non-homogeneous proof sequences to be verified together in a single Nova proof. The resulting Lova pipeline has linear prover time, constant verifying capability, dynamic/easy modification, and optional zero-knowledge privacy to efficiently validate mathematical proofs. Code is available at https://github.com/noelkelias/lova.
A Zero-Knowledge PCP Theorem
We show that for every polynomial q∗ there exist polynomial-size, constant-query, non-adaptive PCPs
for NP which are perfect zero knowledge against (adaptive) adversaries making at most q∗ queries to
the proof. In addition, we construct exponential-size constant-query PCPs for NEXP with perfect zero
knowledge against any polynomial-time adversary. This improves upon both a recent construction of
perfect zero-knowledge PCPs for #P (STOC 2024) and the seminal work of Kilian, Petrank and Tardos
(STOC 1997).