Dynamic i - cs.duke.edu

Dynamic information-flow tracking
Landon Cox
March 24, 2017

Information flow

Crucial goal of a secure system
- Prevent inappropriate information flows
- Can model appropriateness with a lattice of tags
- i.e., only allow low objects to flow into high objects
- Non-interference := all flows are appropriate

Information-flow analysis
- Helps track where sensitive data goes
- Getting this right is tricky
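The lattice rule above ("only allow low objects to flow into high objects") can be illustrated with a minimal sketch. It assumes tags are sets of category names ordered by subset; the names `Tag`, `LOW`, `join`, and `can_flow` and the categories are hypothetical, not taken from the slides.

```python
from typing import FrozenSet

# Assumption for this sketch: a tag is a set of secrecy categories,
# and "higher" in the lattice means "a superset of categories".
Tag = FrozenSet[str]

LOW: Tag = frozenset()  # bottom of the lattice: public data


def join(a: Tag, b: Tag) -> Tag:
    """Least upper bound: data derived from a and b carries both tag sets."""
    return a | b


def can_flow(src: Tag, dst: Tag) -> bool:
    """A flow is appropriate only if dst is at least as high as src."""
    return src <= dst


medical = frozenset({"medical"})
financial = frozenset({"financial"})
both = join(medical, financial)

print(can_flow(LOW, medical))        # low into high
print(can_flow(medical, both))       # into a strictly higher object
print(can_flow(both, medical))       # high into lower
print(can_flow(medical, financial))  # incomparable tags
```

Under this model, non-interference amounts to `can_flow` holding for every flow the system performs.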

Information flow

Building blocks
- Storage objects (information receptacles)
- Processes (move information to/from objects)

Tracking information
- A tag (or label) describes information sensitivity
- Each storage object is assigned a tag
- Need to update tags as processes execute

Information flow

Issue 1: precision
- Say that the storage object is an address space
- If process P reads sensitive data item D, P's entire address space is tagged

What must we assume about any of P's outputs?
- Must assume that they contain sensitive information

Which processes are allowed to communicate with P?
- Other processes that are allowed to read D

Why is this problematic?
- Probably want P to communicate with processes that can't access D
- Hard to do anything useful otherwise
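A small sketch of process-grained tracking may make the precision problem concrete. It assumes a toy model in which each process has one tag set for its whole address space; the `Process` class and its fields are hypothetical.

```python
class Process:
    """Process-grained tracking: one tag set covers the whole address space."""

    def __init__(self, name, may_read=()):
        self.name = name
        self.may_read = frozenset(may_read)  # tags this process may read
        self.taint = frozenset()             # tags on the entire address space

    def read(self, object_tags):
        """Reading a tagged object taints everything the process holds."""
        object_tags = frozenset(object_tags)
        if not object_tags <= self.may_read:
            raise PermissionError(f"{self.name} may not read {set(object_tags)}")
        self.taint |= object_tags

    def can_send_to(self, other):
        """Every output is assumed to carry the sender's full taint."""
        return self.taint <= other.may_read


p = Process("P", may_read={"D"})
q = Process("Q")                    # not allowed to read D
r = Process("R", may_read={"D"})

before = p.can_send_to(q)           # P has not read D yet
p.read({"D"})
after_q = p.can_send_to(q)          # P is now cut off from Q
after_r = p.can_send_to(r)          # only readers of D remain reachable
print(before, after_q, after_r)
```

Once P has read D, it can no longer talk to Q even about data unrelated to D, which is the imprecision the slide describes.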

[Figure: SSH client, uid/pw, and password file around the code below]

    accept uid/pw;
    if (pw not in file) {
        return error;
    } else {
        fork/exec shell;
    }

How do you solve this?
- Often use a trusted declassifier
- Small piece of code trusted to remove tags

[Figure: the same diagram with a declassifier added]

What else could we do to improve precision?
- Use finer-grained storage objects
- Tag program variables or memory words
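Both remedies above, a trusted declassifier and finer-grained tags, can be sketched together. This is a minimal illustration under stated assumptions: tags are attached per variable, and `Tagged`, `password_ok`, `declassify`, and the `"pwfile"` tag are hypothetical names.

```python
class Tagged:
    """A value plus the set of tags attached to that one variable."""

    def __init__(self, value, tags=()):
        self.value = value
        self.tags = frozenset(tags)


def password_ok(supplied, password_file):
    """The result is computed from the password file, so it inherits its tags."""
    return Tagged(supplied in password_file.value, password_file.tags)


def declassify(result, tag):
    """Trusted code: strips one tag, and only from a yes/no answer."""
    if not isinstance(result.value, bool):
        raise TypeError("this declassifier only releases boolean results")
    return Tagged(result.value, result.tags - {tag})


password_file = Tagged({"hunter2", "correct horse"}, tags={"pwfile"})
banner = Tagged("welcome")                       # never touched the password file

verdict = password_ok("hunter2", password_file)  # still tagged {"pwfile"}
released = declassify(verdict, "pwfile")         # tag removed by trusted code
print(set(verdict.tags), set(released.tags), set(banner.tags))
```

With per-variable tags, `banner` stays untagged even though the same program read the password file. The cost is that every operation has to maintain tags.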

What are the implications for performance?
- Have to update tags much more frequently
- i.e., every time an instruction executes
- Can introduce a lot of overhead

Tracking explicit flows

Propagate taint tags with data flows

    c ← a op b          taint(c) ← taint(a) ∪ taint(b)
    setTaint(a, t)      taint(a) ← {t}
    c = a + b           taint(c) ← {t} ∪ {} = {t}
    Send(c, foo.net)    Can foo.net see a?

Information flow

Issue 2: explicit vs implicit flows

Two ways to propagate information
- Explicitly := direct transfer from one object to another
- Implicitly := indirect transfer, usually via control flow

    // a is sensitive
    int foo (int a) {
        int b, w, x, y, z;
        a = 11;
        b = 5;
        w = a * 2;
        x = b + 1;
        y = w + 1;
        z = x + y;
        print (z);
    }

Each line is an explicit flow from source operands to destination operand.
Very easy to implement: just interpose on each instruction to update each var's tag.

Where is the implicit flow? How would you update x's tag?

    // a is sensitive
    void foo (int a) {
        int x, y;
        if (a > 10) {
            x = 1;
        }
        y = 10;
        print (x);
        print (y);
    }

What is tricky about this code?

    // a is sensitive
    void foo (int a) {
        int x, y;
        if (a > 10) {
            x = 1;
        } else {
            y = 10;
        }
        print (x);
        print (y);
    }

What is trickier about this code?

    // a is sensitive
    void foo (int a) {
        int x, y;
        if (a > 10) {
            baz (&x);
        } else {
            bar (&y);
        }
        print (x);
        print (y);
    }

Where is the implicit flow here? How would you track this?

    // a is sensitive
    void foo (int a) {
        int x, y;
        if (a > 10) {
            exit(0);
        } else {
            exit(1);
        }
        y = 10;
        print (x);
        print (y);
    }
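The explicit-flow rule and the implicit-flow gap can both be seen in a short sketch. It is a simplified variant of the slide examples, not the lecture's implementation: the `Tainted` wrapper is hypothetical, `x` is initialized to 0 (the slide leaves it uninitialized), and the `a = 11` overwrite is omitted.

```python
class Tainted:
    """An integer paired with a set of taint tags."""

    def __init__(self, value, taint=()):
        self.value = value
        self.taint = frozenset(taint)

    @staticmethod
    def lift(x):
        """Treat plain constants as untainted values."""
        return x if isinstance(x, Tainted) else Tainted(x)

    def __add__(self, other):
        other = Tainted.lift(other)
        return Tainted(self.value + other.value, self.taint | other.taint)

    def __mul__(self, other):
        other = Tainted.lift(other)
        return Tainted(self.value * other.value, self.taint | other.taint)


def explicit(a):
    """Data flows only: each result takes the union of its operands' taint."""
    b = Tainted(5)
    w = a * 2
    x = b + 1
    y = w + 1
    return x + y


def implicit(a):
    """x depends on a only through the branch, so its tag stays empty."""
    x = Tainted(0)
    if a.value > 10:
        x = Tainted(1)  # constant assignment: no tainted source operand
    return x


a = Tainted(11, taint={"t"})
z = explicit(a)
x = implicit(a)
print(z.value, set(z.taint))  # z carries a's tag
print(x.value, set(x.taint))  # x reveals whether a > 10 but is untagged
```

A tracker that only propagates along data flows reports `x` as clean, even though printing it leaks one bit about `a`.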

Hidden channels
- Get system to communicate in unintended ways
- Example: Tenex (supposedly secure OS)
- Created a team to break in
- Team had all passwords within 48 hours... oops.

Password checker

    for (i = 0; i < 8; i++) {
        if (input[i] != password[i]) {
            break;
        }
    }

Goal: require 256^8 tries to see if password is right

Hidden channels: Tenex

How to break? (user passes in input buffer, virtual memory faults are visible)
- Specially arrange the input's layout in memory
- Force a page fault if the second character is read
- If you get a fault, the first character was right
- Do again for the third, fourth, ..., eighth character
- Can check the password in 256*8 tries

Course administration

Project proposals
- Due today (ok if you send it to me by Monday)
- Guidelines in the syllabus
- One page should be fine

Amount of work
- Three weeks of effort
- Focus on answering one interesting question

Cloud: large-scale analysis, collection, dissemination.
Mobile: present at work, home, and play.
Sensors: rich, personal data.
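Returning to the Tenex password checker from the hidden-channels slides, the 256*8 bound can be checked with a small simulation. This is a sketch under assumptions, not the historical exploit: the page fault is modeled as a boolean returned by a hypothetical `check` function, and `recover` is likewise a made-up name.

```python
def check(password: bytes, guess: bytes, fault_index: int):
    """Simulated checker. Returns (matched, faulted), where 'faulted' models a
    page fault raised when the checker reads guess[fault_index]."""
    faulted = False
    for i in range(len(password)):
        if i == fault_index:
            faulted = True           # the checker touched the unmapped page
        if guess[i] != password[i]:
            return False, faulted    # early exit, as in the slide's loop
    return True, faulted


def recover(password: bytes):
    """Recover the password one position at a time using the fault signal."""
    n = len(password)
    known = bytearray(n)
    tries = 0
    for pos in range(n):
        for c in range(256):
            known[pos] = c
            tries += 1
            # Place the page boundary right after position pos.
            matched, faulted = check(password, bytes(known), pos + 1)
            if matched or faulted:   # the checker got past position pos
                break
    return bytes(known), tries


secret = b"s3cr3t!!"
recovered, tries = recover(secret)
print(recovered == secret, tries, 256 * 8)
```

Each position costs at most 256 guesses because the fault confirms a correct prefix, so the total is bounded by 256*8 rather than 256^8.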

App-centric operating systems

Apps access sensitive information in many contexts
- Location, images, and communication
- Home, work, and play

Apps run on behalf of many stakeholders
- Users, services, developers, platform providers, advertisers

Monitoring app behavior
- Permissions are coarse.
- No insight into what is collected and by whom.
- Consumer: Why is my wallpaper app sending my phone number to another country?
  http://blog.mylookout.com/2010/07/mobile-application-analysis-blackhat/
- Enterprise: Who is collecting information about our workers?

Wider interest in the issue
http://online.wsj.com/article/SB20001424052748703806304576242923804770968.h

Emerging malware threat
[Figure: new mobile malware (1) and new mobile malware families or variants (2)]
1. McAfee Threats Report: Q1 2012 - http://www.mcafee.com/us/resources/reports/rp-quarterly-threat-q1-2012
2. F-Secure Mobile Threat Report Q1 2012 - http://www.f-secure.com/weblog/archives/MobileThreatReport_Q1

Where does data go after you grant access?

Monitoring goals

Monitor where apps send data
- What happens after you grant access?
- Is observed behavior expected?

Monitor apps at runtime
- Want users to monitor their own apps
- Must balance accuracy and efficiency

Solution: TaintDroid
- Original collaboration with Penn State, Intel
- Will Enck (NCSU), Jaeyeon Jung (Samsung), others

Taint tracking

TaintDroid: system-wide taint tracking for Android
- Records explicit data dependencies via taint tags
- Does not capture implicit data dependencies

[Figure: a username and password entering an app, annotated with three steps]
- Check tags of emitted data
- Track how information propagates
- Tag data as it enters the app

Key issues for tag propagation
- How are tags stored?
- What is the tag-propagation logic?
- Is tracking precise and efficient?

Project website: http://appanalysis.org

Tag propagation

Goal: balance precision and efficiency
[Figure: precision vs. speed. Process-grained tracking is fast but imprecise (all outputs tainted); instruction-grained tracking is precise but slow (2-20x overhead); the ideal is fast and precise.]

Multi-level approach
- Variable-level tracking through Dalvik VM (DEX instructions)
- Patch state after native method invocation
- Extend tracking to IPC and file system

[Figure: architecture with labels: application code, Dalvik VM, msg, native system libraries, network, file system; message-level, variable-level, method-level, and file-level tracking]

Variable-level tracking
Tag-propagation logic for Dalvik executables (DEX)

Modified Dalvik VM
- Store and propagate 32-bit tags

Local vars and args
- Store tags adjacent to vars on stack
- Correspond to VM registers
- 64-bit vars require two tags

Class fields
- Store tags inside heap objects

Arrays
- One tag per array
- Trade precision for efficient storage

Performance optimizations
- Per-variable tags reduce storage overhead
- Adjacent tags provide spatial locality
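The storage layout described above can be sketched as follows. This is an illustration, not TaintDroid's actual code. It assumes a 32-bit tag is a bit vector combined with bitwise OR; the `Frame` and `TaintedArray` classes and the `TAINT_LOCATION` and `TAINT_IMEI` bit values are hypothetical.

```python
import operator

TAINT_LOCATION = 0x1  # hypothetical bit assignments for this sketch
TAINT_IMEI = 0x2


class Frame:
    """Register file in which slot 2*i holds register i and slot 2*i+1 its tag,
    so a value and its tag sit next to each other in memory."""

    def __init__(self, num_regs):
        self.slots = [0] * (2 * num_regs)

    def get(self, r):
        return self.slots[2 * r]

    def tag(self, r):
        return self.slots[2 * r + 1]

    def set(self, r, value, tag=0):
        self.slots[2 * r] = value
        self.slots[2 * r + 1] = tag & 0xFFFFFFFF  # keep tags to 32 bits

    def binop(self, dst, a, b, op):
        """dst = a op b, with the destination tag set to the union of the sources."""
        self.set(dst, op(self.get(a), self.get(b)), self.tag(a) | self.tag(b))


class TaintedArray:
    """One tag for the whole array: cheap to store, but imprecise."""

    def __init__(self, n):
        self.items = [0] * n
        self.tag = 0

    def store(self, i, value, tag):
        self.items[i] = value
        self.tag |= tag

    def load(self, i):
        return self.items[i], self.tag


frame = Frame(3)
frame.set(0, 7, TAINT_LOCATION)
frame.set(1, 5)
frame.binop(2, 0, 1, operator.add)

arr = TaintedArray(2)
arr.store(0, 1234, TAINT_IMEI)
print(frame.get(2), frame.tag(2), arr.load(1))
```

Loading `arr[1]` returns the IMEI tag even though only `arr[0]` was ever tainted, which is the precision given up for compact storage.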

[Figure: stack frame layout (SP, FP, VM goop) in which each slot (out0, out1, v0 == local0, v1 == local1, v2 == in0) is followed by its taint tag]

Method-grained tracking

Huge opportunity for performance gains
- JNI code is often CPU intensive

Challenge for method-grained tracking
- In the worst case, must manually reason about side effects
- Luckily, a very simple heuristic works most of the time

    class java.lang.Math {
        public static double cos (double d);
    }

Method-grained tracking

Tainting heuristic
Assign the union of the arguments' tags to the return value on exit.
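The tainting heuristic can be sketched in a few lines. Here Python's `math.cos` and `math.pow` stand in for side-effect-free native methods. The `call_native` wrapper and the bit-vector tags are assumptions for illustration, not part of TaintDroid.

```python
import math


def call_native(fn, *tagged_args):
    """Call fn on the raw values and tag the result with the union of the
    argument tags. Each argument is a (value, tag) pair."""
    values = [value for value, _ in tagged_args]
    result_tag = 0
    for _, tag in tagged_args:
        result_tag |= tag
    return fn(*values), result_tag


cos_value, cos_tag = call_native(math.cos, (0.0, 0x1))
pow_value, pow_tag = call_native(math.pow, (2.0, 0x1), (3.0, 0x2))
print(cos_value, cos_tag, pow_value, pow_tag)
```

The wrapper never looks inside `fn`, so it misses any side effects, such as a native method that writes into an object passed by reference.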

- Most JNI methods have no side effects
- Many JNI methods operate on native types

When it doesn't work, use method profiles
- Generic framework for defining argument/retval dependencies
- So far, only needed to define one for the IBM charset converter
- See paper for more details

Method-grained tracking

Found 2,844 JNI methods in Android source
- 913 did not use Object references
- Others could induce false negatives

Third-party JNI is not supported
- Apps must be written entirely in Java
- Survey of Android Market: ~25% used a .so file
- Subject of ongoing research

Evaluation

Is TaintDroid fast and precise?
[Figure: the precision vs. speed spectrum (process-grained, instruction-grained) with TaintDroid added]

Performance evaluation

[Figure: CaffeineMark 3.0 scores (higher is better) for Android and TaintDroid on the sieve, loop, logic, string, float, method, and total benchmarks; score axis from 0 to 2000]
- 14% overhead
- 20% overhead (extra memory accesses)
- Not shown: 4.4% memory overhead

Reasons for efficiency
(1) Method-grained tracking of JNI calls
(2) Spatial locality of taint tags
(3) One tag per array

App study
- Selected 30 apps from Android Market
- Biased toward popular apps

- Sampled from 12 categories

App permissions
- Access to Internet
- Access to location, camera, phone state, mic
- No native libraries

Ran apps manually under ...

App study

Of 105 flagged connections, only 37 to expected servers

App study: location

15 of 30 apps shared location with ad server
- admob.com, ad.qwapi.com, ads.mobclix.com, data.flurry.com

Most traffic was plaintext (e.g., AdMob HTTP GET)

    ...&s=a14a4a93f1e4c68&..&t=062A1CB1D476DE85B717D9195A6722A9&d%5Bcoord%5D=47.661227890000006%2C-122.31589477&...

- data.flurry.com used binary format

In no cases were users informed by EULA
In one case, app sent location every 30 seconds

App study: phone identifiers

7 apps sent device id (IMEI)
2 apps sent phone info (Ph. #, IMSI*, ICC-ID)

Done without informing the user
- One app's EULA indicated the IMEI was sent
- Another app sent the hash of the IMEI

Frequency was app-specific
- One sent info every time the phone booted

appanalysis.org

Source code available
- http://appanalysis.org/
- Most recent version is for Android 4.3

Great platform for research
- Compatible with vast majority of Android apps
- Playground for all kinds of information-flow projects

Video demo by Peter Gilbert

TaintDroid demo
http://www.youtube.com/watch?v=qnLujX1Dw4Y

Media coverage

Limitations

Implicit flows
- Fundamentally difficult problem
- Can handle passwords (SpanDex, USENIX Sec)

Native code
- Ongoing work
- Talk to Ali!
