Estimating Closeness to the Uniform Distribution on RC4 Keystream Bytes using Property Testing By Eliezer Yucht Prepared under the supervision of Prof. Dana Ron Project presentation 8 February, 2017 Tel Aviv University, Faculty of Engineering 1 Agenda Introduction + Background

RC4, WPA-TKIP and the measure Estimating Closeness via Learning Uniformity testing Paninski test The collision tester Comparing the fingerprints Conclusion Project presentation 8 February, 2017

Tel Aviv University, Faculty of Engineering 2 The RC4 cipher RC4 is a stream cipher that was designed by Ron Rivest in 1987. Very fast and simple in hardware and software. Used in many systems/protocols: WEP,

WPA-TKIP (wireless networks), SSL and more. Project presentation 8 February, 2017 Tel Aviv University, Faculty of Engineering 3 The RC4 Algorithm The algorithm consists of 2 parts: 1. Key Scheduling Algorithm (KSA) 2. Pseudo Random Generation Algorithm (PRGA) The KSA

Project presentation 8 February, 2017 Tel Aviv University, Faculty of Engineering 4 The PRGA Project presentation 8 February, 2017 Tel Aviv University, Faculty of Engineering 5 Biases in the Keystream Empirical distributions (obtained by , 16-byte keys) [AlFardan et al]

1 256 Project presentation 8 February, 2017 Tel Aviv University, Faculty of Engineering 6 [Mantin & Shamir] [Isobe et al] 1 256 Project presentation 8 February, 2017

Tel Aviv University, Faculty of Engineering 7 With further stream locations, the bias power is weakened 1 256 Project presentation 8 February, 2017 Tel Aviv University, Faculty of Engineering 8 WPA-TKIP

Interim solution to replace WEP TKIP per-packet key: Temp Shared Key (16 byte) Weakens Security: TSC-dependent (strong) biases in the keystream [Paterson et al] TSC (6 byte) Key mix 0= 1 Transmitter

MAC address (6 byte) 2= 0 bytes per-packet key 16 Project presentation 8 February, 2017 Tel Aviv University, Faculty of Engineering 9 TKIP TSC-dependent biases Keystream distribution at position 1 Project presentation 8 February, 2017

Tel Aviv University, Faculty of Engineering 10 Keystream distribution at positions 17 and 33 For Project presentation 8 February, 2017 Tel Aviv University, Faculty of Engineering 11 Motivation Find which bytes locations in the stream are

good for encryption (i.e. relatively close to the uniform distribution), versus bad bytes (i.e. farther than some threshold from the uniform distribution). Using the as a measure tool Working on pairs of consecutive keystream bytes How many samples do we need to distinguish between the above two cases? Project presentation 8 February, 2017

Tel Aviv University, Faculty of Engineering 12 The measure Let be two (discrete) probability functions over the domain ; then, the distance between them is: In our case: 0x00-0xFF

+1 0x00-0xFF Range: [0,=65,535] Thus the domain size is Therefore: Project presentation 8 February, 2017 Tel Aviv University, Faculty of Engineering 13 Estimating Closeness via Learning

How to find ? Need a Sample Accurately, needs samples infeasible Have to use approximate methods 1. Draw samples ( ) according to 2. For each domain elements , count how many times it appeared in the sample (denote this value by Project presentation 8 February, 2017 Tel Aviv University, Faculty of Engineering 14

Theorem: :, For a sample size of , the following holds with high probability Corollary (due the triangle inequality): :If , then In our case: (from our initial tests) Therefore, Project presentation

8 February, 2017 Tel Aviv University, Faculty of Engineering 15 Simulation results For ( Distribution learned 0.00784 0.00073 0.06117 0.00481

Recall: Therefore: Project presentation 8 February, 2017 Tel Aviv University, Faculty of Engineering 16 Simulation results For (

Distribution learned 0.00784 0.00073 0.06117 0.00479 Recall: Therefore: Project presentation 8 February, 2017 Execution time of about 10 days!

(on a single CPU) Tel Aviv University, Faculty of Engineering 17 Addressing execution time 1. Distributed network For example 128 processors + threads :Drawbacks a) Requires a relatively large amount of resources b) Eventually the same (total) sample size 2.

Tolerant test Accept, if the distance between the tested distribution and the uniform distribution is less than some predefined threshold : .Reject, if the distance is greater than another predefined threshold such that :Drawbacks a) In the general case, for a constant , [Gregory and Paul Valiant] 3. Uniformity testing Accept if the tested distribution is the uniform distribution. Reject if its distance is greater than . It is known that is sufficient, but also required ( ) [Paninski] Project presentation 8 February, 2017 Tel Aviv University, Faculty of Engineering

18 Paninski test Important observation: The further a distribution is from the uniform distribution, the greater the number of collisions that will occur in its sample. The algorithm: 1. Draw samples from the tested distribution . 2. Count how many bins have exactly one sample in them (denote this value by ). 3. If some_threshould, reject (the hypothesis that is the uniform distribution), otherwise,

accept. Project presentation 8 February, 2017 Tel Aviv University, Faculty of Engineering 19 Paninski test results Using a sample size of 60,000 < 500 simulations Distribution Project presentation 8 February, 2017

126 23,846 128 116 23,989 24,017 129 24,019 Tel Aviv University, Faculty of Engineering 20 The Collision Tester

Counts the number of colliding pairs in the sample: Used for estimating the collision probability. Based on a similar observation as before; .If , accept; otherwise reject Works also in the general case. The sample size complexity: [Goldreich and2 Ron] Recently by [Diakonikolas et al]

Project presentation 8 February, 2017 Tel Aviv University, Faculty of Engineering 21 The collision tester results For , 100 simulations After Zoom in Project presentation 8 February, 2017 Tel Aviv University, Faculty of Engineering

22 For Project presentation 8 February, 2017 Tel Aviv University, Faculty of Engineering 23 For Less than 25 minutes

Project presentation 8 February, 2017 Tel Aviv University, Faculty of Engineering 24 The Fingerprint A fingerprint is a vector whose th entry denotes the number of domain elements that appear exactly times in the sample. Can also be described as the histogram of the histogram For example

:Results of rolling a dice 10 times :The histogram that depicts the results (over {1,2,,6}) :The fingerprint obtained Project presentation 8 February, 2017 Tel Aviv University, Faculty of Engineering 25 The fingerprint (of a sample) contains all the information (collision statistics) that required for testing symmetric

properties (such as the distance from the uniform distribution). In particular, the number of colliding pairs can be retrieved from the fingerprint: Project presentation 8 February, 2017 Tel Aviv University, Faculty of Engineering 26 Comparing the fingerprints

Using a sample size of 100 simulations Project presentation 8 February, 2017 Tel Aviv University, Faculty of Engineering 27 Project presentation 8 February, 2017 Tel Aviv University, Faculty of Engineering 28 Project presentation

8 February, 2017 Tel Aviv University, Faculty of Engineering 29 Project presentation 8 February, 2017 Tel Aviv University, Faculty of Engineering 30 Project presentation 8 February, 2017 Tel Aviv University, Faculty of Engineering 31

Project presentation 8 February, 2017 Tel Aviv University, Faculty of Engineering 32 Pr ( 2 =0 ) Project presentation 8 February, 2017 2 256 Tel Aviv University, Faculty of Engineering 33 Pr ( 1=128 )

Project presentation 8 February, 2017 7.5 256 Tel Aviv University, Faculty of Engineering 34 Conclusion Learning the distance between our 4 tested distributions and the uniform distribution requires about samples (about 10 days on a single CPU). Using the collision tester we managed to

distinguish between all 4 distributions even with a sample size of samples (less than 25 minutes). The collision tester can be applied for testing other applications (not only in the RC4 context). Project presentation 8 February, 2017 Tel Aviv University, Faculty of Engineering 35 Questions? Project presentation 8 February, 2017

Tel Aviv University, Faculty of Engineering 36