An Unusual Approach to Integer Factorization
Keywords: RSA, semiprime, factor, factorization, factorisation, XML, recursion, cryptography, security, serialization
- Download source - 6 Kb
This article discusses an unusual way to find the prime factorization of numbers of an arbitrary size. It uses the dom4j open source library for working with XML.
Background
I discovered this approach in 2004, while giving some thought to the RSA Challenge. It isn't currently suitable for factoring numbers of that size, although the code is quite capable of finding the factorization of RSA-length semiprimes, given enough time (or luck). While this technique may not be completely novel, it's not an approach I've ever heard of before.
Given two numbers of equal length (for example, "12" and "34"; "987" and "654"; "1230" and "0321"; etc.), the product of any two multiplicands ending in those numbers will have products which all end in the same digits as one another, up to the exact length of the original numbers. For example, the product of 12 and 56 is 672. Since 12 and 56 are both two digits in length, we observe that the last two digits of their product are "72". This means that any two numbers ending in 12 and 56, when multiplied together, will end in 72. For example:
12 * 56 = 672
912 * 256 = 233472
95126812 * 32684556 = 3109177613915472
This property can be exploited to derive the factorization of composite numbers, or to prove primality. It can be assumed, for example, that a product ending in the digit "1" (for example, "4217821") might have as its two factors number pairs ending in (1,1), (3,7), or (9,9), since:
1 * 1 = 1
3 * 7 = 21
9 * 9 = 81
Carrying on with the base candidate pair (1,1), we can generate further candidate pairs by prepending numbers to each candidate pair (making sure to avoid reciprocals such as (21,01) and (01,21):
31 * 91 = 2821
41 * 81 = 3321
51 * 71 = 3621
61 * 61 = 3721
11 * 11 = 121
21 * 01 = 021
This process is repeated recursively for each new candidate pair. The next pair under test would be (31,91), and so on, eventually yielding the factors (1031,4091). Candidate pairs are rejected when their product exceeds the value of the composite being factored. An iterative implementation, while possible, would require extraordinary amounts of memory for large factorizations. At the very least, then, this provides a real-world example of how recursion can be used to reduce the computing resources necessary to solve a problem.
Had the first node evaluated been (3,7) and not (1,1), a factorization would not have been found for that node. Eventually, every candidate child node of (3,7) would have been marked "dead" – that is, their products all grew larger than the value of the product under test. The node (3,7), having failed to produce a factorization or any further viable candidate pairs, would collapse, causing the algorithm to proceed to the next base candidate pair for analysis – either (1,1) or (9,9).
This factorization algorithm was simple to implement using Java's BigInteger class. Further, this technique works in any radix, and is deterministic (that is, the technique is not a wholly random approach – it is guaranteed to find a solution, given enough time). Randomness can be used to alter runtime unpredictably – for better or worse. Modifying the radix in which the factorization is performed will also impact runtime.
If this technique is to enjoy any real application (particularly to factor RSA-length semiprimes), it seems necessary to extend it by some means. Pattern detection is one possible enhancement. If a candidate string were to exceed some threshold of probability for bitwise entropy, the number and all its potential children could be eliminated from consideration, reducing processing time.
It is appropriate to note here that radixes which are powers of two are best suited to bitwise pattern detection, since growing a candidate in those bases by repeatedly prepending numbers does not alter the bits from previous iterations. An illustration will be helpful.
Base 10 multiplication:
5 * 6 = 30 (1 1110)
15 * 26 = 390 (0001 1000 0110) bits are changed!
Base 16 multiplication:
4 * A = 28 (10 1000)
24 * 1A = 3A8 (11 1010 1000) all bits from the parent candidate pair are unchanged
Distributed processing is another possible enhancement. Since the Java code which implements this technique uses XML to manage its calculations, it would be trivial to modify the code to accept work from and submit results to a central server. Similarly, XML serialization could be used to implement a suspend-to-disk feature, so that work could be paused and resumed at will. I am neither a cryptographer nor a mathematician, and so cannot say whether my technique is entirely novel, or even useful in a practical sense. Certainly I have never seen this approach described elsewhere. It is my hope that this observation may be integrated with other techniques in order to decrease the runtime of performing factorizations on RSA-length semiprimes.
Using the code
Using the code is as simple as instantiating the FindFactors class and calling the Factorize, IsPrime, or PrimeFactorize method.
The Factorize method returns NULL if the value of the key is prime; otherwise, the first two factors found for the key are returned. These numbers may or may not be prime themselves. For a complete factorization, use the PrimeFactorize method.
The PrimeFactorize method returns NULL if the key is prime; otherwise, it returns an array of strings containing the prime factorization of the key.
The IsPrime method wraps the Factorize function, and simply returns true if the return value from Factorize was null; otherwise, false.
FindFactors f = new FindFactors();
String[] factors =
f.Factorize(new BigInteger("56071009"), 10, false);
String[] primeFactors =
f.PrimeFactorize(new BigInteger("FA1E", 16), 10, false);
boolean isPrime =
f.IsPrime(new BigInteger("1323"));
The second parameter for Factorize and PrimeFactorize is the radix in which calculations should be performed. This value is completely unrelated to the radix of the number under test. For example, the call to PrimeFactorize above factors the hexadecimal number 0xFA1E, but performs its factorization in base 10 by first converting the number into base 10 in memory, and proceeding from there.
The third parameter for Factorize and PrimeFactorize are the radix of the number under test (above, 10 and 16, respectively), and a Boolean value indicating whether or not to shuffle the sequence of numbers prepended to candidate pairs on each pass. Using randomization has a practical impact on the length of time required to perform a factorization, although it can't be known beforehand whether or not that impact will be helpful or not.
The IsPrime method has two overloads, one in which the radix of operations is specified, and another in which a default base of 6 is used. Both overloads of this function use randomization when prepending.
Points of Interest
Prime numbers are cool.
History
30 May 2005 - Uploaded to teh Intarweb.
6 Mar 2005 - First pass at converting code into a sample project and article.
1 Sep 2004 - Experimented with pattern detection for bits and characters in factors.
30 Mar 2004 - Recursive implementation written.
17 Feb 2004 - Initial idea conceived; iterative implementation written.

4 Comments:
This idea can be done iteratively without the huge resources you assume.
Many people have also come up with this idea, but it is always rejected since it does not improve upon trial division, which is already too slow.
Do you want free porn? Contact my AIM SN 'p1nkn3ss' just say 'give me some pics now!'.
No age verification required, totally free! Just send an instant message to AIM screen name "p1nkn3ss".
Any message you send is fine!
AIM abuse can be reported here.
Get any Desired College Degree, In less then 2 weeks.
Call this number now 24 hours a day 7 days a week (413) 208-3069
Get these Degrees NOW!!!
"BA", "BSc", "MA", "MSc", "MBA", "PHD",
Get everything within 2 weeks.
100% verifiable, this is a real deal
Act now you owe it to your future.
(413) 208-3069 call now 24 hours a day, 7 days a week.
Alan Turing is often considered to be the father of modern computer science. - - With the Turing test, Turing made a significant and characteristically provocative contribution to the debate regarding artificial intelligence: whether it will ever be possible to say that a machine is conscious and can think. He provided an influential formalisation of the concept of the algorithm and computation with the Turing machine, Find Out More at http://heros4u.com/alan_turing.htm
Post a Comment
<< Home