NSA Backdoor
Last updated
Was this helpful?
Last updated
Was this helpful?
I heard someone has been sneakily installing backdoors in open-source implementations of Diffie-Hellman... I wonder who it could be... ;)
This challenge was very difficult for me and took over 10 hours. This is likely the intended method. Scroll down for a much simpler approach.
The script is very similar to the script provided in the challenge. The largest difference is that c = pow(3, FLAG, n)
is used instead of c = pow(FLAG, e, n)
(where e = 3
). Even though only one value was swapped, this is a substantial change that changes the encryption algorithm from RSA to Diffie-Hellman. So, instead of solving the discrete d-th root problem (RSA), we are solving the discrete logarithm problem (Diffie-Hellman). of the major difference between RSA and Diffie-Hellman.
There is one hint for this challenge and it reads "Look for Mr. Wong's whitepaper... His work has helped so many cats!" Searching online for "Wong NSA Backdoor" finds (). Searching for "diffie-hellman David Wong" finds (this is David Wong's website), which links to of a talk by David Wong explaining his paper. The paper and talk both link to on GitHub, which is a repo containing the code for this backdoor exploit.
To solve this challenge, I read the paper multiple times, watched the presentation, and looked at a variety of miscellaneous resources on the internet. So, I suggest you also read the paper and watch the presentation to gain a deep understanding of this challenge.
The paper works up to an attack called "Composite Modulus with B-Smooth Order (CM-HSO)," which is what we will be using in this challenge. In order for the CM-HSO backdoor to be a NOBUS (Nobody-But-Us) backdoor, the composite modulus $n = pq$ must be large enough such that it is not susceptible to factoring with (just like in RSA). To quote the paper, this method's "security also relies on the RSA’s assumption that factoring n is difficult if n is large enough." In the paper and code, "as a way of countering Pollard's p-1 we can add a large factor to both $p-1$ and $q-1$ that we will call $p_{big}$ and $q_{big}$ respectively." However, as we saw in , the public modulus is factorable with Pollard's p-1 algorithm since these large factors are not used and because are used in the script. Thus, we can use the same script as in to factor $n$. Running the script with the given n
produces the following values for p
and q
:
Note that is another implementation of Pollard's p-1 that you can try, but I found , which I use in , to be better.
According to the paper, "Since $p-1$ and $q-1$ are both B-smooth, they are susceptible to be factored with the Pollard's p-1 factorization algorithm, a factorization algorithm that can find a factor $p$ of $n$ if $p-1$ is partially-smooth."
Next, we need to compute $x$ where $g^x\equiv c\pmod n$ (can also we written as $g^x\bmod n=c$ since $c<n$). We know all variables except $x$ and we know that $n=p*q$.
In the folder of the there is a called backdoor_generator.sage
and a called backdoor_generator_tests.sage
. The testing script sounds the most interesting because it shows off the exploit versus the generator script shows how to set up the exploit.
This outputs the following:
Thus, we can see that the graphic in the paper right below the previously quoted text accurately displays the situation. p-1
and q-1
each factor into 2, a small prime, and a large prime. However, if we take our value for p-1
and get it's prime factors by running the following command:
This produces a list of 67 different factors. Therefore, our values for the challenge cannot be broken using this method and we will need to use CM-HSO. Looking at the examples in the test_CM_HSO
function in the test script, we see that a list of factors is provided, which is exactly what we have.
This method is not explained since it should be somewhat understandable if you read through the above method.
Essentially, all of Mr. Wong's code (splitting the problem into multiple parts and then using the CRT) is unnecessary for this challenge and sagemath's discrete_log
function is powerful enough.
picoCTF{1ca93858}
Note that before continuing you are going to want to have installed on your computer, which you can do by following their .
The backdoor_generator.sage
script discusses two methods of creating backdoored Diffie-Hellman values: Composite Modulus with Hidden Small Subgroup (CM-HSS) and CM-HSO, which I mentioned earlier. CM-HSS is what I thought was originally being used in this challenge since the paper states that it is no longer a Nobody-But-Us backdoor due to the work of () and ().
As we can see in the (which I have saved to this directory) and section 4 of the paper, "two small subgroups" are used. These small subgroups exist because a generator g is chosen "such that both g modulo p and g modulo q generate 'small' subgroups." The existence of the subgroups allows "us to compute two discrete logarithms in two small subgroups instead of one discrete logarithm in one large group." The paper states "For example, we could pick $p$ and $q$ such that $p-1 = 2_{p_1p_2}$ and $q-1 = 2_{q_1q_2}$ with $p_1$ and $q_1$ two small prime factors and $p_2$, $q_2$ two large prime factors." Indeed, if we open and look at the examples at the start of the test_CM_HSS
function we see a small factor for p-1
and another small factor for q-1
. We can confirm this by taking the bigger p
value and getting prime factors of p-1
in SageMath by running the below command:
Luckily for us, David Wong has written most of the SageMath code for us in the test_CM_HSO
function in the test script. If you look at our script, you'll notice that almost all of it is copied pasted from the test_CM_HSO
function. On lines 1-12 of our script, we define the values we were given and that we figured out so far, then we calculate p-1
and q-1
, and finally we compute the prime factors of p-1
and q-1
. Then, we simply apply the algorithm discussed in the paper and presentation to go through the many small subgroups and reconstruct the private key. At the end we print the private key solution and we compute $c=g^x\bmod n$ (where $x$ is the private key that we found). This is exactly how the $c$ value given to us was computed so if the values are the same then we have found a valid private key. Running the script with sage solve.sage
produces the following output:
The encrypted messages match so we have found the correct key. So, we have solved the discrete logarithm $c=g^x\bmod n$ for $x$. However, trying to decode this to ascii produces gibberish. This is because . Essentially, it has something to do with group theory. The quite nicely that $\phi(n)=(p-1)(q-1)$ (where $\phi(n)$ is the size of the group) if $n=pq$. According to the paper several properties of p
and q
result "in a non-cyclic group with an upper-bound on possible subgroup orders of $lcm(s,t)=\frac{(p-1)(q-1)}{2}$." So, gmpy2.lcm(p-1, q-1) == ((p-1)*(q-1))//2
. Therefore, is $\lambda=\frac{(p-1)(q-1)}{2}$. The Carmichael function $\lambda(n)$ is the exponent/order of the . ( explains more about the Carmichael function. We don't use Euler's totient because the Carmichael function is more generalized.) Therefore, we can add or subtract $\frac{(p-1)(q-1)}{2}$ from the solution our SageMath script found (because we have a ) and then compute $c=g^x\bmod n$ with our modified solution and still get the original $c$ value.
I'm still unsure about the math behind this. So, I messed around with the script for a while and eventually found that subtracting $\frac{(p-1)(q-1)}{4}$ from the computed key worked successfully and got the flag.
First, use Pollard's p-1 algorithm to calculate p
and q
. You can use the implementation in .
Second, run , which will print the hexadecimal key, 0x7069636f4354467b31636139333835387d
, and the flag.