On the discrete logarithm problem for prime-field elliptic curves

In recent years several papers have appeared investigating the classical discrete logarithm problem for elliptic curves by means of the multivariate polynomial approach based on the celebrated summation polynomials, introduced by Se-maev in 2004. However, with a notable exception by Petit et al. in 2016, all numerous papers have investigated only the composite-ﬁeld case, leaving apart the laborious prime-ﬁeld case. In this paper we propose a variation of Semaev’s original approach for the prime-ﬁeld case. Our proposal outperforms both the original Semaev’s method and Petit et al. specialized algorithm. The improvement is reached by reducing the necessary Groebner basis computations to only one basis calculation.


Introduction
Several cryptographic schemes base their security upon the hardness of the discrete logarithm problem for elliptic curves (ECDLP) [12,14].For an elliptic curve E defined over a finite field F q , an instance of the ECDLP is the following: given P, Q ∈ E(F q ), compute an integer w, if it exists, s.t.Q = wP .
The best known algorithms for the ECDLP are algorithms that work on arbitrary cyclic groups -like Pollard's Rho algorithm [18], which runs in time O( |E(F q )|) if |E(F q )| is prime -exception made for algorithms that are specific for some families of weak curves (as, for example, [13]).
In 2004 I. Semaev introduced [19] a family of polynomials, named summation polynomials, proposing their exploitation for an index calculus algorithm for elliptic curves.The Index Calculus is originally a subexponential algorithm to compute discrete logarithms in the multiplicative groups of finite fields.However, This work was supported by CARITRO Foundation it is customary to use the name index calculus algorithm to refer to any algorithm that computes discrete logarithms in a cyclic group G by collecting linear relations first and using linear algebra afterwards.Following [6] and restricting to the case G = E(F q ), with r = |E(F q )| a prime integer, the simplest version of index calculus algorithm consists of two main steps: the relation collection step and the linear algebra step.In the relation collection step: 1. a factor base F ⊂ E(F q ) is defined; 2. for random integers u, v, the point R = uP + vQ is computed; 3. if possible, R is written as a sum of multiples of points of F: with the integers F 's ranging in a small coefficient set; 4. u, v and the vector ( F ) are stored as a row of a matrix M ; 5. the procedure from item 2 to item 4 is repeated until at least |F| points R as in (1) are found.
After the collection of a large enough number of relations, the linear algebra step solves the discrete logarithm problem: 1. using linear algebra on M , a linear dependency of points R is computed, obtaining the relation λP + µQ = ∞ with λ, µ ∈ Z; 2. w is recovered from the linear congruence λ + µw = 0 (mod r), which is solvable except the extremely unlikely case when µ = 0.
The complexity of an index calculus algorithm mainly lies in the decomposition of a point R as sum of multiples of points in F, usually known as the point decomposition problem (PDP).The solution of the PDP must be efficient (including the decision on the decomposability of a point R) and with a high success rate.Both these features are deeply affected by how F is defined and by its size.
In the third section of Semaev's paper [19], he proposed to reduce the PDP to the problem of finding specific solutions of a multivariate polynomial equation deduced from summation polynomials.Semaev sketched his proposal in the case of elliptic curves defined over prime fields, suggesting to define F as the set of rational points whose x-coordinates are small when taken as integers in the standard complete residue system [0, . . ., p − 1].Furthermore, in Remark 2 of [19], Semaev also outlined a variation of his proposal to binary elliptic curves defined over extension fields.This variation involves the Weil descent, which cannot be applied to prime fields.
Semaev's paper triggered a deep interest in summation polynomials.Starting from them, Diem claimed that the ECDLP for some elliptic curves could be solved with a subexponential-time complexity (rather than exponential) in his 2006 talk [2].However, the first papers developing Semaev's approach appeared only in 2009, by Gaudry [8], and then in 2011 by Diem [1].In [8] Gaudry shows that Semaev's proposal can be used to solve, in heuristic time Õ(q 2− 2 n ), the ECDLP for elliptic curves defined over fields of size q = q n (from now on, ECDLP(q,n)), with q prime or a prime power and n ≥ 2. However, Gaudry's results cannot be extended to the prime-field case.Diem [1] proved that the same problem can be solved in an expected time of e O(max(log q,n 2 )) .Unfortunately, when q = 2 and n is large (one of the two common cases in leading cryptographic standards), the above works do not lead to an algorithm more effective than Pollard's Rho algorithm.In 2012 Faugère, Perret, Petit and Renault [5] claimed that, under a heuristic assumption, the ECDLP over any binary field F 2 n can be solved in time O(2 ωt ), with t ≈ n/2 and where 2.376 ≤ ω ≤ 3 is the linear algebra constant.Petit and Quisquater [16] revisited polynomial systems proposed by Faugère et al. in [5], introducing a heuristic assumption, named first fall degree assumption, under which the ECDLP(2,n) can be solved in time O(2 cn 2/3 log n ), where c is a constant smaller than 2. Further relevant results were then accomplished exploiting symmetries in the works of Huang, Petit, Shinohara and Takagi [9], of Faugère, Gaudry, Huot and Renault [4], and of Galbraith and Gebregiyorgis [7].In the latter paper, the authors propose to use disjoint factor basis F i and require P i ∈ F i in (1).In 2015, Semaev [20] and Karabina [11] had independently the idea to lower the degrees of polynomial equation systems, appearing in the PDP, at the cost of a larger number of variables.Both [20] and [11] provided experimental results suggesting that their idea can lead to smaller running times.
There is no doubt that the listed results (and many others not cited here) determined a huge improvement over the index calculus algorithms for elliptic curves.However, the question whether the index calculus algorithm proposed by Semaev in 2004 can lead to a subexponential algorithm for binary elliptic curves is nowadays still open.
Even if the original proposal in Semaev's paper directly considered the case of elliptic curves defined over prime fields, only one recent paper ( [17], 2016) takes such demanding case into account.Among the main obstacles that one may encounter when facing this case, it is worth mentioning the difficulty to endow F with an algebraic structure, and the impossibility to exploit the Weil descent in the PDP.Nevertheless, elliptic curves defined over prime fields are nowadays widely spread in cryptographic applications, as for example the Bitcoin system [15].In [17] Petit et al. deal with the identification of a suitable F by introducing rational (or algebraic) maps L, obtained by a preprocessing specific for a given curve E, and defining F as the set of rational points In this paper we propose a new variant of index calculus algorithm for the prime-field case.At the beginning of our algorithm (item 1 in the relation collection step), we define the size of F but not F itself, which instead will be defined on the run.Indeed, every point R produced in item 2 in the relation collection step is appended to F, giving a trivial decomposition of R in item 3. We proceed in this fashion until F contains the predetermined size.At this point a linear dependency among points in F is obtained solving a point decomposition problem by means of summation polynomials.Consequently, our proposal reduces the index calculus algorithm for elliptic curves over prime fields to the solution of a single multivariate polynomial equation system.We compare the complexity of our algorithm with that by Petit et al. showing that we improve the complexity of the algorithm in [17].Nevertheless, more work is needed in order to reach a possible enhancement on Pollard's Rho algorithm.We hope that our proposal can encourage further discussions on the prime-field case which has been somewhat neglected so far.
The paper is organized as follows.In Section 2 we provide some background on summation polynomials and the relevant index calculus algorithm for elliptic curves.In Section 3 we introduce our index calculus algorithm for the prime-field case, providing some complexity evaluations and experimental results.Section 4 concludes the paper with comments and future perspectives.

Summation polynomials and the index calculus algorithm for elliptic curves
In 2004 Semaev introduced summation polynomials in order to use them to build an index calculus algorithm for elliptic curves.Let K be a field of any characteristic.Given an elliptic curve E defined over K, for each m ≥ 2 the m-th summation polynomial S m is defined as follows: Definition 1 Let K be the algebraic closure of the field K.For any integer m ≥ 2, the m-th summation polynomial S m is an element of K[X 1 , . . ., X m ] and it is such that, given x 1 , . . ., x m ∈ K, then S m (x 1 , . . ., x m ) = 0 if and only if there exist y 1 , . . ., y m ∈ K for which (x 1 , y 1 ), . . ., (x m , y m ) ∈ E(K) and The existence of summation polynomials is proved by providing recursive formulae along the following properties.
Theorem 2 ( [19]) Let E be an elliptic curve defined over a field K, with char(K) = 2, 3, by the Weierstrass equation The summation polynomials for E are given as follows: and, for all m ≥ 4 and 1 ≤ k ≤ m − 3, it holds where Res X denotes the resultant polynomial of S m−k and S k+2 with respect to the variable X.Moreover, the polynomial S m is symmetric of degree 2 m−2 in each variable X i , it is absolutely irreducible and for any m ≥ 3.
In the case of elliptic curves defined over fields of characteristics 2 or 3, it turns out that summation polynomials exist and have a similar form (and the same properties, see [19], [1]).

Index calculus algorithm for elliptic curves
In [19], Semaev proposed to use summation polynomials for an index calculus algorithm solving the discrete logarithm problem for elliptic curves.Semaev's idea was firstly developed by Gaudry [8] and Diem [1], and then many other papers on this topic followed.We describe here the current approach to these ideas.As we pointed out previously, all works subsequent to [19], except one, take into account only elliptic curves defined over extension fields.
Consider an elliptic curve E defined over a finite field F q n , where q is a prime or a prime power.Let P ∈ E(F q n ) \ {∞} and Q ∈< P >, where < P > denotes the cyclic group generated by P .It is standard to restrict attention to points P of prime order r.Concerning the relation collection step, the general procedure is the following: 1.An F q -vector subspace V of F q n is fixed, with #V = q where 1 ≤ < n.Then the factor base F is defined as: where x(F ) denotes the x-coordinate of F . 2. Two random integers u, v are generated and the point R = uP + vQ is computed.
3. For a fixed integer m, points P i ∈ F with i ∈ {1, . . ., m} and such that are searched.Such P i 's are obtained by solving the polynomial equation At this point the integer w such that Q = wP can be computed by executing the linear algebra step: 1. via Gauss' elimination, a linear dependency among some M 's rows is obtained, and then its corresponding linear dependency of points R This leads to the sought-after relation λP with λ, µ ∈ Z.
2. w is recovered from the modular equation λ + µw = 0 (mod r) if it is solvable.
As observed, while considering elliptic curves defined over extension fields, the subset V ⊂ F q n is usually chosen as an F q -vector subspace of F q n .Consequently, the Weil descent can be exploited to replace the system of the PDP with an equivalent one, which have polynomial equations over F q .Systems that arise are then usually solved with Groebner basis methods [5], [16], [10], [21], [4].

The prime-field case
The case when the considered elliptic curve E is defined over a prime field F p was originally taken into account by Semaev in [19], where he defines V as the subset {x ∈ F p | x < p 1/m+δ } for some small δ > 0. However, Semaev did not provide an algorithm to solve the systems arising in the point decomposition problem and this case was somewhat neglected so far.
In 2016, Petit et al. were the first, after Semaev, to face the prime-field case.In [17], they primarily observe that the constraints x i ∈ V of item 3 in the above relation collection step are equivalent to the constraints L(X i ) = 0 where L(z) is a polynomial defined as: This observation can be exploited to transform Semaev's prime-field proposal into an actual algorithm.Furthermore, Petit et al. generalise it considering suitable algebraic (or rational) maps L over F p (i.e.maps L that are composition of low degrees maps L 1 , . . ., L t ) and defining V as the set {x ∈ F p | L(x) = 0}.Accordingly, the factor base F is then defined as {F ∈ E(F p ) | x(F ) ∈ V }, where x(F ) is the abscissa of F .To obtain a suitable map L, they introduce a pre-computation, which is specific for every elliptic curve E.
The index calculus algorithm considered by Petit et al. to solve the ECDLP in the prime-field case is the same described in the previous section.However, given a point R = uP + vQ ∈ E(F p ), in order to solve the point decomposition problem for R with respect to F, they suggest to solve the following polynomial equation system: where x(R) is the x-coordinate of R. In their work, Petit et al. provide some experimental results and partial complexity analysis, with which we are going to compare.

Our proposal
In this section we introduce a new variant, for elliptic curves defined over prime fields, of the index calculus algorithm based on summation polynomials.Our proposal differs significantly from the algorithm reported in Section 2.2, since after the relation collection step we do not have a linear algebra step, but rather a relation solving step, as we detail in the following.
Let us consider an elliptic curve E defined over a prime field F p and having a prime number of rational points, i.e. #E(F p ) = r with r prime integer.With P and Q we denote again two points of E(F p ) such that Q = wP .
Relation collection step 1.The factor base F is not fixed, rather it is initialized as the empty set.The final size s of F is fixed together with an integer m, named decomposition constant.As usual (see for example [17, pag.7]) we take s m ≈ p. 2. For random integers u, v, the point R = uP + vQ is computed and added to the factor base F. 3. R does not need to be written as a sum of multiples of points in F, since R itself is a point of F. 4. The procedure in item 2 is repeated until at least s different points are added to F.
Relation solving step 1.Using summation polynomials, a linear dependency among points in F is determined, but not with linear algebra.To be more precise, we consider the multivariate polynomial equation S m (X 1 , . . ., X m ) = 0, where S m is the m-th summation polynomial for E, and we search solutions of the form (x 1 , . . ., x m ) ∈ V m where: Once that such a solution has been found, it is trivial to determine m points P i such that −P i ∈ F or P i ∈ F, and Hence, the linear congruence λP + µQ = ∞, with λ, µ ∈ Z, is deduced.
2. The discrete logarithm w of Q with respect to P is recovered solving the modular equation λ + µw = 0 (mod r).
The algorithm described above differs in two key points from the standard index calculus algorithm usually considered when exploiting summation polynomials.First, F is not fixed deterministically at the beginning of the algorithm.Instead, it is constructed step by step adding the random points R progressively.This prevents the computation of many relations (i.e resolutions of the PDP), which traditionally requires expensive Grobner basis computations.Second, in our second step (which is traditionally the linear algebra step) we actually do not execute linear algebra.On the other hand, the linear dependency among points R is obtained by looking for suitable solutions of the equation S m (X 1 , . . ., X m ) = 0. To do that, we do need to compute one Groebner basis, but it is just one.We now specify which is the ideal basis that we use for our Groebner basis computation.

The system to be solved
In order to deduce a linear relation among points in F we solve the polynomial equation S m (x 1 , . . ., x m ) = 0 considering only solutions of the form (x 1 , . . ., x m ) ∈ V m .Formally, this is reached solving the polynomial equation system where f (z) is the polynomial generating the vanishing ideal of V ⊂ F p .In particular: Although we would be content with any solution of S m (X 1 , . . ., X m ) = 0 of the form (x 1 , . . ., x m ) ∈ V m , we add further constraints on the corresponding variables X 1 , . . ., X m in order to lower the degrees in system (9).Indeed, given any partition {F i | i = 1, . . ., m} of F, the sets can be trivially deduced and the following polynomials constructed: The idea of partitioning F is similar to what Galbraith and Gebregiyorgis proposed in [7] and it allows us to limit the degrees of the univariate polynomials in X 1 , . . ., X m .Therefore, the system to be solved in item 1 of the Relation solving step of our algorithm is the following: The constraints on X 1 , . . ., X m of system (11) break the simmetry, since m disjoint factor bases are used and only P i ∈ F i (or −P i ∈ F i ) are then allowed in (8).This will obviously impact the probability of finding a solution of system (11): a complexity analysis is reported in the next subsection.

Complexity analysis
The whole complexity of the proposed algorithm relies on the solution of a single polynomial equation system.Let T (E, m, V ) be the computational cost of solving the system (11).Under the assumption that the sizes of the F i 's are about the same (which is in practice what we always enforce), the expected number of linear combinations Then the probability that a random point of E(F p ) could be written as a sum Therefore, the total cost of the algorithm is given by: p m m s m T (E, m, V ).
In order to reach an improvement with respect to algorithms for generic cyclic groups, this complexity should be smaller than p 1/2 .We do not claim this result.Indeed our complexity analysis is partial, since we are unable to estimate the complexity of solving our polynomial equation system.We compare our complexity with that of the algorithm of Petit et al. which, following the notation of our paper, can be expressed as: where T (E, m, V ) is the cost to solve one of the systems proposed in [17], which are of the form of system (6).We can assume As a consequence, for a fixed m our algorithm outperforms that of Petit et al. when p increases.In particular, the outperforming happens when Considering m ∈ {3, 4, . . ., 8} (S 8 is nowadays the summation polynomial with the highest index that has been computed [3]), the following table shows from which bit size of p our algorithm would outrun that in [17].
m 3 4 5 6 7 8 bit size 7 14 24 37 52 70 We underline that the bit sizes of p reported in the above table are all smaller than the bit sizes used in cryptographic applications.
The systems arising in our algorithm need to be analyzed more deeply in order to reach a more precise complexity evaluation and it should be also inspected whether algorithms different from Grobner basis methods are more suitable to tackle such systems.Hence, proving theoretically the assumption T (E, m, V ) ≈ T (E, m, V ) is a challenging issue.However, the experimental results reported in next section show that the assumption has heuristic evidence.

Experimental results
We report some computer experiments, executed on elliptic curves defined over prime fields F p with the bit size of p lying in {11, 12, . . ., 22}.The possible values for the decomposition constant m that we considered belong to the set {3, 4, 5}.Following [11] and [20], for m = 4 we substituted S 4 (X 1 , X 2 , X 3 , X 4 ) = 0 with introducing the variable Y and obtaining the following system in place of (11): Similarly, for m = 5 the equation S 5 (X 1 , X 2 , X 3 , X 4 , X 5 ) = 0 was substituted with introducing the variable Y and obtaining the following system in place of (11): For every possible values of m and the bit size of p, we considered ten different elliptic curves.For each of them, we use our algorithm to solve ten instances of the discrete logarithm problem, taking the average execution time.From the ten average times obtained (one for every elliptic curve), we considered the worst one.T (E, m, V ) ≈ T (E, m, V ) that we considered in the complexity analysis.However, our results concern the whole resolution of the discrete logarithm problem (as executing our algorithm just one system needs to be solved), while Petit et al. results concern the time needed to tackle one out of s systems that need to be solved in the execution of their algorithm.So the times reported in the last column should be multiplied by s to obtain the time needed by Petit et al. algorithm to solve the whole discrete logarithm problem.For example, if p has a bit size equal to 11, our algorithm is at least 10 times faster than their.The same remark holds considering Semaev's original proposal in [19].
We perfomed many preliminary computations with several monomial orderings available in MAGMA.Since we did not find any which would clearly outperforms the others with our systems, we decided to use only the "graded reverse lexicographic order" for the computations reported in the table.

Conclusions
We have presented a new index calculus algorithm that exploits summation polynomials for solving the discrete logarithm problem in elliptic curves defined over prime fields.This algorithm significantly differs from the algorithm, for the same case, proposed by Petit et al. at the beginning at 2016 and it reduces to one the number of polynomial equation systems to be solved during its execution.A preliminary complexity analysis suggests that our algorithm improves the one of Petit et al. when m is fixed and p increases.However, the complexity of the single system to be tackled in our algorithm is far to be well understood and it needs to be deeply analyzed.Furthermore, it should be investigated whether other algorithms different from Grobner basis algorithms are better to solve such systems.

4 )
and considering solutions of the form (x 1 , . . ., x m ) ∈ V m .Hence the Point Decomposition Problem is reduced to the solution of a multivariate polynomial equation thanks to summation polynomials.If R is not decomposable as a sum of points of F, it is replaced by a new R. 4. To every x ∈ V , one point in F having x as abscissa is chosen.Consequently, R can be written as R = ±P 1 ±• • •±P m , where P i ∈ F and with signs depending on the selected points.A matrix M with |V | + 2 columns is then constructed: each of the first |V | columns corresponds to a single element of V , while the two last columns correspond to the coefficients u and v, respectively.Given the relation R = ±P 1 ± • • • ± P m , a row of M is filled with the coefficients of the chosen points and the coefficients u, v such that R = uP + vQ. 5.The three steps above are repeated until at least |V | relations have been computed, i.e. at least |V | rows of M have been filled.
[17]experiments were performed with MAGMA on a CPU with an Intel Xeon Process 5460 at 3.16 GHz with a cache of 6 MB.The collected data are showed in the following table, together with an extra column with the experimental results reported in[17]and concerning the average time necessary to solve one of the systems arising in Petit et al. algorithm.The data for even bit sizes are relatively to primes of a specific form; the ones for odd bite sizes are relatively to generic primes.All of them were obtained exploiting a computational power similar to ours.Comparing our results and those of Petit et al. for the case m = 3, it is easy to observe that they are quite similar, heuristically corroborating the assumption bit size m = 3 m = 4 m = 5 Petit et al. (m =