Thursday, November 20, 2014

Euclidean Algorithm, JavaScript and Proof

In this post we will present a JavaScript implementation of the Euclidean Algorithm, and present a proof of its correctness.

Introduction

 The Euclidean Algorithm, was published in "Elements" (300 B.C. !!!), by the Greek mathematician Euclid. It is one of the oldest algorithms that still in use. It is a method to compute the greatest common divisor, and find multiplicative inverses in modular arithmetic.


Some action first

   Here we present a simple JavaScript implementation:

function gcd(a, b){
 a = a * (a < 0 ? -1 : 1); 
 b = b * (b < 0 ? -1 : 1); 

 if (b > a)
  return gcd (b, a);
 else if (b == 0)
  return a;
 else 
  return gcd (b, a % b);  
}

  Let's try to use this code, just put some numbers and see the result:

gcd(, )

 That's a simple application, this algorithm is useful in cryptography to find multiplicative inverses in modular arithmetic, so with a little extra complexity, we can put a richer algorithm in action.
 This is the so called Extended Euclidean Algorithm, we have the Bézout's identity, witch states that for every pair of integers (a, b) we have a pair of integers (x, y) witch satisfy:
a.x + b.y = gcd (a, b)
  So our next step is to present an algorithm that given a pair of integers (a, b) it returns the gcd(a, b) and the pair of integers (x, y) witch satisfy the Bézout's identity, here is our code:

function extended(a, b, obj){
 var i = (a < 0 ? -1 : 1);
 var j = (b < 0 ? -1 : 1);

 a *= i;
 b *= j;

 obj.x = 1;
 obj.y = 1;

 if (b > a)
  var result = extended (b, a, obj);
 else if (b == 0)
  var result = a;
 else {
  var result = extended (b, a % b, obj); 
  var tmp = obj.y;
  obj.y = -obj.y * Math.floor(a / b) + obj.x; 
  obj.x = tmp;
 }

 if (b > a){
  var tmp = obj.y;
  obj.y = obj.x;
  obj.x = tmp;
 }
 obj.x *= i;
 obj.y *= j;

 return result;
}

 So here are our Extended Euclidean Algorithm in action:

a:

b:



As an example we can see an application to cryptography, suppose you want to solve this equation, for d:
gcd(e, m) = 1
e.d ≡ 1 (mod m)
 For e and m known, you can simply run our Extended Euclidean Algorithm, with (a, b) = (e, m), witch will give us:
e.x + m.y = 1
e.x ≡ 1 (mod m) 
d ≡ x (mod m) 
 This equation arises naturally in some cryptography systems, like RSA, and solve it is part of it's implementation. Let's see an example:
n = 2047 = 23 * 89
m = (23 - 1) * (89 - 1) = 1936
e = 179
 We want to find d, witch satisfy:
e.d ≡ 1 (mod m)
 Applying the above algorithm we get:
1936 * 141 + 179 * -1525 = 1
 ⇒ 179 * -1525  1 (mod 1936)
 ⇒ d = 411 
since: -1525  411 (mod 1936)
 This is exactly the example we used here!


Proof of Correctness

 Now we want to proof that the Euclidean Algorithm is correct! Let's first assume that:
|a| ≥ |b| 
 This can always be done, without loss of generality, simple if |b| > |a|, change what we call a and b. So we can write:
a = y.b + r
for y = ⌋|a| / |b|⌊
 Now, if r = 0, we have:
 a = y.b ⇔ gcd(a, b) = b.
 If |r| > 0, we have:
a = y.b + r ⇒  a ≡ y.b + r mod(gcd(a, b)) 
⇒  0 ≡  r mod(gcd(a, b)) ⇒ gcd(a, b)  = gcd(b, r)
 Since |r| < |b|, if we repeat |b| or less this operation, we will get |r| = 0, and the algorithm always finish. So we conclude our proof.

 Now let's give a proof for the Bézout's identity, witch states that for any pair of integers (a, b) we have a pair of integers (x, y) that satisfy:
a.x + b.y = gcd (a, b)
  So let's apply the Euclidean Algorithm n times until we get r = 0, let's call a = r1 and b = r2:

(1) r1 = y1.r2 + r3
(2) r2 = y2.r3 + r4
(3) r3 = y3.r4 + r5
...
(i) ri = yi.ri+1 + ri+2
... 
(n) rn = yn.rn+1
  As we can see, yis an integer defined by:
yi = ⌋|ri| / |ri+1|⌊
  And as pointed in the first proof:
gcd(a, b) = rn
  Let's obtain Bézout's identity for step (n -2):
rn-2 = yn-2.rn-1 + r
⇒ rn-2 - yn-2.rn-1 = rn  
⇒ rn-2 - yn-2.(rn-3 - yn-3.rn-2) = rn 
 ⇒ rn-2.(1 + yn-2.yn-3) - yn-2.rn-3 = rn 
⇒ rn-2.x + rn-3.y = gcd(rn-2,rn-3) 
with x = (1 + yn-2.yn-3)
and y = - yn-2
and gcd(rn-2,rn-3) = rn
 We can go back, for step (n - 3) and go on until we get a and b, it's clear that this is not a formal proof, we have to make an inductive proof, since I'm a physicist I will give an example and omit the formal proof, let's apply this stuff for gcd(746, 102):
(1a) 746 = 102 * 7 + 32
(2a) 102 = 32 * 3 + 6
(3a) 32 = 6 * 5 + 2
(3b) 32 - 6 * 5 = 2
(2b) 32 - (102 - 32 * 3) * 5 = 2 ⇒ 102 * (-5) + 32 * 16 = 2
(1b) 102 * (-5) + (746 - 102 * 7) * 16 = 2 ⇒ 746 * 16 + 102 * (-117) = 2
 Observe that the pair (x, y) is not unique, since:
746 * (-35)  + 102 * 256 = 2

Monday, November 17, 2014

RSA cryptosystem

In this post we will introduce some ideas behind the RSA cryptosystem, and present a simplified javascript implementation.

Motivation

 The RSA cryptosystem (Ron Rivest, Adi Shamir, Leonard Adleman; 1977) is one of the pioneers of public key cryptosystems.
  A public key cryptosystem is based on a pair of keys: Ke (encryption), Kd (decryption) in witch we expect that is not viable to encipher/decipher text without knowing the respective key and we expect that is trivial knowing it. Here we use some imprecise words, by not viable we mean that it's very expensive computationally and can't be done in a reasonable amount of time and by trivial we mean that it's inexpensive computationally.
  RSA uses as motivation two facts of the current state of number theory:

  1. There is no quick algorithm known to factorization a large number.
  2. It's very quick to test if a number is prime.
  If in the next years someone discover a quick algorithm for factorization, the RSA will no longer be useful. But this is very unlike to happen, given that there is a long history of tries.

How it works

  We choose two large primes: p, qand set:
n = pq
 Again our assumption is that is rather easy for a computer to find two large primes p, q. But knowing n we cannot find the factors p, q in a reasonably amount of time.
  Next we choose a number e, that has no common factor to (p - 1)(q - 1) (for example choose an e that is a prime number) and have the propriety that:
max(p, q) < e < (p - 1)(q - 1).
  After that we find:
d ≡ e-1 mod (p - 1)(q - 1).
For a reference on how to solve this equation take a look at here!
In this situation we will have that:

  1. Ke = (n, e).
  2. Kd = (n, d).
  3. ƒ(P) ≡ Pe mod n    where P stands for plain text.
  4. ƒ-1(C) ≡ Cd mod n   where C stands for cipher text.

A sketch of proof

 Now we have to prove our assumption, witch is by no means obvious, that encipher and then deciphering will return the plain text:
P ≡ ƒ-1(ƒ(P)) ≡ ƒ(ƒ-1(P)) ≡ Ped (mod n). 
 For this proof we will use Fermat's little theorem:
xp-1 ≡ 1 (mod p)
for any prime p and any number x that is not divisible by
Proof:
(0.x (mod p), 1.x (mod p), 2.x (mod p), ... , (p - 1).x (mod p))
is a rearrangement of the sequence
(0, 1, 2, ...,  (p - 1)) 
Suppose that is not true, so there is:  
i, j (0 ≤ i, j < p and i ≠ j) with:
 i.x ≡ j.x (mod p) 
i - j ≡ 0 (mod p)
This is obviously false because p is prime. 
1.x . 2.x . ... . (p - 1).x ≡  1 . 2 . ... . (p - 1) (mod p)
xp-1.(p - 1)! ≡ (p - 1)! (mod p)
Since p does not divide (p - 1)!
xp - 1 ≡ 1 (mod p)
And we finish our proof.

Now we are in condition to prove:
P  ≡ Ped (mod n).
Proof:
First of all, we have to notice that what we want to prove is equivalent
(1) P  ≡ Ped (mod p)
(2) P  ≡ Ped (mod q).
Because p and q are relative primes. Now since:
e.d  ≡ 1 mod (p - 1)(q - 1)
We have that:
e.d - 1 = (q - 1).(p - 1).λ
for λ some positive integer. So if q does not divide P,  we have in (2):
Ped ≡ Ped-1. P(q - 1)(p - 1)λ.≡ (P(q - 1))(p - 1)λ.P (mod q)
Now by Fermat's little theorem we know that:
P(q - 1) ≡ 1 (mod q)
So we can conclude:
Ped ≡ (P(q - 1))(p - 1)λ.P ≡ 1(p - 1)λ.P ≡ (mod q).
Notice that if q divides P equation (2) is trivially true, and we can proceed in same fashion for equation (1). Now we finish our proof.

 

Javascript implementation

   Now we want to have some practical feeling about the algorithm. First of all we need to program a function to do the dirty work in modular exponentiation:
xe   (mod m)

function modexp (x, e, n){
 if (e == 0)
  return 1;
 if (e == 1)
  return x % n;
 if (e % 2 == 0)
  return modexp ((x * x) % n, e / 2, n);
 return (x * modexp (x, e - 1, n)) % n; 
}

  We will work with the ASCII printable characters (95 total): empty space (code 32) to tilde (code 126), both for the plain and the cipher text, we will use every letters in plain text as P, and we will return 2 letters as C, so:
951 < n < 952
  Obviously this situation is not what is used by your bank in real transactions. Here is our encryption function:
function RSA_encrypt (plain, n, e){
 var min = 32;    // first character
 var max = 126;    // last character
 var rng = max - min + 1; // total number
 
 var cipher = "";
 for (var i = 0; i < plain.length; i++){
  x = plain.charCodeAt(i) - min;
  x = modexp(x, e, n);
  
  k1 = Math.floor(x / rng) + min;
  k2 = x % rng + min;

  s = String.fromCharCode(k1, k2);
  cipher = cipher.concat(s);
 }
 return cipher;
}

 Here is our decryption function:

function RSA_decrypt (cipher, n, d){
 var min = 32;    // first character
 var max = 126;    // last character
 var rng = max - min + 1; // total number

 var plain = '';
 for (var i = 0; i < cipher.length; i+=2){
  x = (cipher.charCodeAt(i) - min) * rng;
  x += cipher.charCodeAt(i + 1) - min;
  x = modexp(x, d, n);
  plain = plain.concat(String.fromCharCode(x + min));
 }
 return plain; 
}

And here this code in action for:
(n, e, d) = (2047, 179, 411)







For understand how we get the values (n, e, d) take a look at here!

Sunday, November 9, 2014

Counting problems and Statistical Physics

 In this post we will analyze four variants of a counting problem and introduce it's relation with statistical physics.


 The problem

We have N balls inside an urn, we want to count the number of possible outcomes when we draw M balls.

 The variants

We have to choose if the sequence of the balls that we draw matters and if every time we draw a ball we replace it or not.


Case 1: Maxwell-Boltzmann statistics

Ordered sample with replacement

(1, 1)(1, 2)(1, 3)(1, 4)
(2, 1)(2, 2)(2, 3)(2, 4)
(3, 1)(3, 2)(3, 3)(3, 4)
(4, 1)(4, 2)(4, 3)(4, 4)

Table 1: The 16 possibilities for N = 4, M = 2


For a generic value of N and M, the number of possibilities is:
NM
This case is equivalent to arrange M distinguishable particles inside N cells where the cells can contain multiple particles. For example classical particles, witch we assume that are distinguishable.

Case 2

Ordered sample without replacement

(1, 1)(1, 2)(1, 3)(1, 4)
(2, 1)(2, 2)(2, 3)(2, 4)
(3, 1)(3, 2)(3, 3)(3, 4)
(4, 1)(4, 2)(4, 3)(4, 4)

Table 2: The 12 possibilities for N = 4, M = 2

For a generic value of N and M, the number of possibilities is:
N! / (N - M)!
This case is equivalent to arrange M distinguishable particles inside N cells where each cell can contain only one particle.


Case 3: Bose-Einstein statistics

Unordered sample with replacement

(1, 1)(1, 2)(1, 3)(1, 4)
(2, 1)(2, 2)(2, 3)(2, 4)
(3, 1)(3, 2)(3, 3)(3, 4)
(4, 1)(4, 2)(4, 3)(4, 4)

Table 3: The 10 possibilities for N = 4, M = 2

For a generic value of N and M, the number of possibilities is:
(N + M - 1)! / M! (N - 1)!
This case is equivalent to arrange M indistinguishable particles inside N cells where the cells can contain multiple particles. For example bosons (proton) are the indistinguishable particle witch do not obey Pauli exclusion principle. This is the so called Bose-Einstein statistics.

Case 4: Fermi-Dirac statistics

Unordered sample without replacement

(1, 1)(1, 2)(1, 3)(1, 4)
(2, 1)(2, 2)(2, 3)(2, 4)
(3, 1)(3, 2)(3, 3)(3, 4)
(4, 1)(4, 2)(4, 3)(4, 4)

Table 4: The 6 possibilities for N = 4, M = 2

For a generic value of N and M, the number of possibilities is:
N! / M! (N - M)!

This case is equivalent to arrange M indistinguishable particles inside N cells where each cell can contain only one particle. For example fermions (electron) are the indistinguishable particle witch obey Pauli exclusion principle. This is the so called Fermi-Dirac statistics.