HardEncrypt Documentation -- Why it's uncrackable

Why the HardEncrypt Package is uncrackable
(or, why it's better than public key schemes--RSA, PVP, etc.)

How does the HardEncrypt package measure up when compared to currently available, popular, internet encryption packages and schemes? Most modern encryption software (for instance, web browser "secure" connections) use public key encryption schemes. These encryption schemes us two keys in the message transfer process, a public and a private key, and they make use of one-way functions. A common pattern of usage is as follows: the message sender encrypts the message using the public key and a one-way function, and the receiver decrypts the message using a private key and a different one-way function. Transmitting a message in the opposite direction requires a different set of keys. The great thing (and only great thing) about public key encryption schemes is that they are incredibly convenient. Anyone, even an untrusted party, can send a relatively secure message to a receiver using the receiver's public key.

Most current public key implementations rely on features of prime number factorization for their keys and their one-way functions. Prime numbers are integers that can't be evenly divided by any other numbers except for 1 (of course, they can also be evenly divided by themselves). All integers, prime or composite (not prime), can be broken down into a set of prime number factors. When the members of this set of prime numbers are multiplied together, the original number is the product. For instance, 100 has the following prime number factorization: {5, 5, 2, 2}. The number 7 has the following prime number factorization: {7}. Certain numbers have a prime number factorization that contains only two prime numbers (21 is an example of this kind of number). For very large numbers like these, finding the two prime number factors is a relatively difficult problem. For example, what two prime numbers can be multiplied together to get 1909? You may have to spend a while with a calculator to figure out the answer.

For these systems, part of the public key consists of a large number that has exactly two prime numbers as its set of factors. We won't go into the details of the one-way functions here since you can find a complete description at the RSA encryption website, http://www.rsa.com/rsa/rsamath/index.html, but suffice it to say that the private key is easily determined if the two prime numbers are known. Thus, RSA and other prime number public key systems rely on the fact that prime number factorization is difficult to do. The above calculator example (1909) should serve as an example of why this problem is difficult. The two prime factors, by the way, are 23 and 83. But if you actually did find them using a calculator, you probably did it by trial and error. Algorithms for prime number factorization do exist, but they are slow.

As of right now, all known algorithms for prime number factorization are quite slow. They are all reducible to an algorithm that tries every possible factor. So, given a large number N, if you know that N has two factors, you can find them by trying to divide N by every number up to sqrt(N). You'll eventually find one of N's two factors this way, and after dividing, you'll have found the other as well. But, there's nothing to say that you won't have to try all possible factors, and there are sqrt(N) of them, so running time of the algorithm is bounded by N, or the size of the number you're trying to factor. To make the encryption keys harder to crack, you simply use a larger N.

A running time of N doesn't seem bad (seems polynomial). However, if you look at the problem as one that takes N as input in binary form, and N is represented by X binary digits, then the running time bound of the algorithm is really sqrt(2^X) = 2^(X/2), or exponential in the input size . This is a *very* slow algorithm--when you increase the input size by 2, the running time doubles. There are algorithms available today that make slight improvements on the running time, but there are no known algorithms that don't have exponential time bounds. Modern cryptosystems rely on the fact that exponential algorithms take (almost literally) forever to run. For instance, if you're using a 128-bit encryption scheme (N is less than 2^128), even if your computer system could check one possible factor every clock cycle (at 500 MHz), it would take your computer 1,169,884,835 years to run the algorithm. That's a very long time indeed.

Notice that, though it may take a long time, it *is* possible to find the two factors. The other thing to notice is that when you've found the private key, you're sure of it.

Though all known algorithms for cracking RSA and other prime number systems are slow, no one has proved that no polynomial time algorithm exists. For instance, if an X^2 algorithm were discovered (vs. a 2^X algorithm), you could solve the above 128-bit factorization problem on the above computer in less than a second. In computer science lingo, problems that are provably hard are called NP-complete. The answers to these problems can be verified in polynomial time, but the answers can only be found deterministically in exponential time. (Well, in theory, no deterministic algorithms exist to solve NP-complete problems in polynomial time, but no one has proved it yet.) Many known problems are NP-complete. As of right now, no one has even been able to prove that prime number factorization is NP-complete. This means that someone could discover a polynomial time algorithm for this problem at any time. Not that someone ever will, but they could. If they did, RSA's security would fall apart instantaneously.

It should be noted at this point that quantum computers could be used to calculate prime number factorization very quickly. Quantum computers use the spin states of atom nuclei to represent binary states (e.g., spin up, spin down). By taking advantage of quantum physics superposition properties, all possible binary inputs can be fed into a quantum digital "circuit" at the same time, and all of the corresponding results are produced at the same time. Thus, to factor a large N, you'd simply have to feed all possible factors into a quantum circuit that tests for numbers which divide N. In theory, a quantum computer can be used to complete prime number factorization in polynomial time. However, as of right now, only very simple quantum computers (3-bits) are being experimented with.

The point is not that prime number factorization can be done quickly, because as of right now it can't. The point is that it can be done. The NSA and other such organizations have a lot at stake when it comes to cracking messages encrypted with prime number schemes. For all we know, they may have a non-quantum, polynomial time algorithm already--if they did, why would they announce it? But aside from the possibility of a very fast algorithm, people have been working on algorithms for this problem that make slight speed improvements. The NSA (and others) has supercomputers that run these algorithms quickly. 40-bit prime number schemes (a popular internet implementation) can be cracked by the above computer system (1 factor test per clock cycle, 500 MHz), in about half an hour. The NSA, certainly, has computer systems thousands of times faster than this. However, if there really is no polynomial time algorithm, then the NSA would need a computer 500 million times faster than our example to be able to crack a 128-bit key in 2 years. We can reasonably assume that they don't have computers 500 million times faster than our example system, and we can also assume that they don't have 2 years to devote towards cracking a single person's key.

Now that prime number public key systems have been explained, how does HardEncrypt stack up? Well, to start out with its weak points, HardEncrypt is less convenient. It is a private-private key system that uses symmetric encryption and decryption. Both parties need the same key, so receiving secure messages from untrusted parties is not possible. What's more, exchanging keys can be burdensome since key secrecy is of utmost importance. For public key systems, public key secrecy and exchange are not issues--everyone can have the public key.

Also, HardEncrypt cannot be used for public authentication. With public key systems, users can "sign" a message using their private keys and let the public verify the signatures using the public keys. With HardEncrypt, authentication is still possible, but you can't let the world know that a message is really from you. You can, however, let your communictation partners know that a message is really from you: when they decrypt a message using your key, they know that only your key could have created the encrypted message. If a different key were used to encrypt the message, they would decrypt garbled, meaningless text. Anyone you want to authenticate yourself to needs the key you're using for the authentication. With public key systems, this is not the case.

We should also mention the possibility of fraud. Given an encrypted message, a fake key can be computed that will "decrypt" the message into any message of the same size. Thus, if attackers were trying to frame you for a crime, they could quickly generate a dummy key that would decrypt your message into one detailing the plans for the crime. Of course, during a trial, you could produce a key that decrypted the message correctly (or you could produce a key that decrypted the message into a letter to Grandma), but it would be your word against theirs. If you found yourself in this unfortunate situation, you could probably demonstrate in court that a fake key can be made to decrypt your message into EVERY message, so having a key that decrypts the message in one particular way says nothing about what the message really contained. This could actually be seen as an advantage over RSA and other public key schemes. If you are actually involved in criminal activity (heaven forbid!) and some large organization is trying to trap you, they might be able to crack your RSA key and bring decrypted messages from you to court with them. If these messages do actually detail a crime, they could prove (via the properties of RSA) that the messages are real and are from you. With a HardEncrypted message, the evidence against you would be far weaker.

As for strong points, let's start out with some computer science jargon. Cracking a HardEncrypt key is an NP-complete problem, whereas cracking an RSA key may not be. In fact, cracking a HardEncrypt key is better than NP-complete: it's actually NP-hard. This means that the problem is harder than all other NP-complete problems. In fact, cracking a HardEncrypt key is better than NP-hard: it's undecidable*. Given an encrypted message, it's impossible to find the key. Even if you do guess correctly and stumble on to the right key, there's no way to verify that the key is correct.

In other words, even if attackers (for instance, the NSA) pits all of their best computers and algorithms against your HardEncrypted message, they will never be able to decrypt it, not even if they spend 1,169,884,835 years trying. No one will ever discover an algorithm for decrypting HardEncrption since the problem is undecidable.

So, HardEncrypt offers you a less convenient, but completely secure, encryption alternative. Sure, RSA is fine for some, if not all, messages. Where the line is drawn for RSA security depends on how much progress the NSA (and others) has made in this area. Is 128-bit RSA completely secure? Is 256-bit RSA secure? 40-bit RSA is definitely not secure. If 128-bit RSA is secure, then the NSA is not doing its job, because people are already using 128-bit encryption.

Public key systems are fine for letters to grandma, but when it really matters.... well.... would you use encryption that the NSA can crack in a matter of hours?

* undecidable isn't really "better" than NP-hard. In fact, undecidable problems are a subset of the NP-hard set, and there are other problems in NP-hard that are harder than undecidable. But undecidable problems are harder than some NP-hard problems. For instance, certain problems are NP-hard because their answers can be verified only in exponential time (instead of in polynomial time). The answers to undecidable problems, though they exist, cannot be verified at all. Saying that a problem is NP-hard is a weaker statement than saying a problem is undecidable, so in a matter of speaking, undecidable is better than NP-hard.