Generate Parsers! Prevent Exploits!

Or

LangSec for Ruby Devs

Or

Hey!

Theory of Computation

is relevant to Web Development

(maybe)

What?

exploits!

formal languages!

generating parsers!


Exploits! They Suck!

Having an app exploited sucks

Attackers find

Chink in armor

hole in the wall

crack in the dam

analogy in analogy

Achilles' heel

Attackers get

crown jewels

DB, eval

shell

ROOT!!

the whole kit n' caboodle

PSA:

Your passwords are hashed and salted, right?

Exploits ~~ Tricks

Hard to predict

Unrelated

Buffer Overflow

SQL Injection

XSS

all different?

No ! ! !

Exploits are all the same!

Unexpected Computation!

Computation!

Theory of Computation: Hand Wave Edition

*NOT YET*

Exploit != Trick

Exploit == Machine

How does an exploit work?

  1. Takes Input
  2. ??? (Does Stuff)
  3. Output / profit

(profit is important w/ exploits)

Same as regular program

What's different?

Exploit anatomy

undefined behavior in your app

communication channels

Exploits live

inside

your app / framework

Input Via Host Application's Inputs

Exploit === Weird Machine *

* technical term

App |
    | <- IN
    |

(past app code)

App Code |
     /---| <- IN
     |   |
     v   |

(to exploit)

     |     |
     v     |
  Exploit  |

stopping exploits

jam their comms!

validate inputs before use

IN - what is it?

App |
    | <- IN
    |

IN:

instructions to the exploit

Exploit == Weird Machine

IN:

program in exploit-ese

asm_sql_injection anyone?

Example:
Rails XML(type=yaml)

CVE-2013-0156

(not actual program)

    <yaml type="yaml">
    --- !ruby/object:Evil
    hi: eval_me
    </yaml>

...

Running Ruby!

:-(
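
A minimal sketch of rejecting this kind of payload, assuming a modern Ruby whose bundled Psych provides `YAML.safe_load` (the 2013 Rails code path predates this). The `Evil` class and the `safe_parse` helper are hypothetical names used only for illustration.

```ruby
require "yaml"

# Hypothetical stand-in for a class an attacker would like instantiated.
class Evil; end

# Payload modeled on the slide: a YAML document that asks for a Ruby object.
EVIL_PAYLOAD = "--- !ruby/object:Evil\nhi: eval_me\n"

# Returns plain data, or :rejected when the document asks for a Ruby object.
def safe_parse(text)
  # safe_load builds only basic data types unless told otherwise.
  YAML.safe_load(text)
rescue Psych::DisallowedClass
  # Reject the whole input instead of building the object.
  :rejected
end
```

Plain data such as `"---\nhi: there\n"` still parses to a Hash; only the object tag is refused.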

PSA:

Everybody's patched, right?

What's the real problem here?

Bad Input validation!

What's the fix?

Good Input Validation!

No Bad input allowed!

But How?

recognize inputs completely before processing them

LangSec

Language Theoretic Security

recognize inputs completely before processing them


Recognize?

Are you talking about decidability?

yep

Theory of Computation: Hand Wave Edition

Chapter 0: Decidability

In a formal system, is there a procedure that answers yes or no for every statement in that system?

an example

"This statement is false."

halting problem

undecidable

/handwave

Undecidable problems exist

what about input validation?

:-(

for certain types of input, recognition is undecidable

being sure an input won't trigger exploit:

impossible

"certain types"

Theory of Computation: Hand Wave Edition

Chapter 1: Formal Languages

Chomsky Hierarchy

Regular

Boring and Safe

No matching parens

features: delimiters
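
A small sketch of a regular-language recognizer in Ruby. The input format (lowercase words separated by commas) and the names `CSV_WORDS` and `regular_input?` are made up for illustration.

```ruby
# A regular language: one or more lowercase words separated by commas.
# \A and \z anchor the pattern so the WHOLE input must match.
CSV_WORDS = /\A[a-z]+(?:,[a-z]+)*\z/

# True only when the entire string is in the language.
def regular_input?(text)
  CSV_WORDS.match?(text)
end
```

The delimiter is all the structure there is: no counting, no nesting.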


Context Free

less boring, mostly safe

features: {(nesting)}
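
Nesting is what a regular expression (in the formal sense) cannot track; a stack can. A minimal sketch, assuming an alphabet of just `(`, `)`, `{` and `}`. The names `PAIRS` and `balanced?` are illustrative.

```ruby
# Maps each closing bracket to the opening bracket it must match.
PAIRS = { ")" => "(", "}" => "{" }.freeze

# Recognizes properly nested brackets using a stack.
def balanced?(text)
  stack = []
  text.each_char do |ch|
    case ch
    when "(", "{"
      stack.push(ch)
    when ")", "}"
      # The most recent unclosed opener must be the matching one.
      return false unless stack.pop == PAIRS[ch]
    else
      return false # anything outside the alphabet is rejected
    end
  end
  stack.empty? # leftover openers mean the input is incomplete
end
```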


Context Sensitive

less boring, mostly safe

features: field length prefixes allowed
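
A sketch of what a length-prefixed format asks of its recognizer: the parser has to trust, then verify, a number found in the data itself. The format here (`<length>:<bytes>,`, similar in spirit to netstrings) and the helper name are assumptions for illustration.

```ruby
require "strscan"

# Returns the list of fields, or nil if the input is not in the language.
def length_prefixed_fields(text)
  scanner = StringScanner.new(text)
  fields = []
  until scanner.eos?
    # Declared length (at most 4 digits), then ':'.
    return nil unless scanner.scan(/(\d{1,4}):/)
    length = scanner[1].to_i
    body = scanner.peek(length)
    # The declared length must match what is actually there.
    return nil unless body.bytesize == length
    scanner.pos += length
    # Each field ends with ','.
    return nil unless scanner.scan(/,/)
    fields << body
  end
  fields
end
```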


Recursively Enumerable

can describe a Turing machine

most powerful


Chomsky Hierarchy

Chapter 1.a: Formal Language Recognition

Decidability of formal language recognition

Recursively Enumerable

can describe a Turing machine

most powerful

Goose has root


Context Sensitive

or below

Decidable


Chapter 1.b: Comparing Formal Language Parsers

Context Free has two parts

Deterministic

Non-Deterministic

deterministic context-free

comparing parser equivalency:

decidable


Whoa

recognize inputs completely before processing them

use deterministic context-free languages (or weaker) as inputs

LangSec no-nos

  • Turing Complete Inputs
  • ad hoc input validation
  • weak parsers checking strong languages

Turing Complete Inputs

"This system is very extendable/updatable because it embeds macros/scripting/programming language in data" --run like hell

-- Science of Insecurity

ad hoc input validation

AKA Shotgun parsing
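
A sketch of the difference, using a made-up "transfer order" hash. All names here (`shotgun_transfer`, `valid_order?`, `checked_transfer`) are hypothetical; the `log` array stands in for real side effects.

```ruby
# Shotgun style: checks are interleaved with side effects.
def shotgun_transfer(order, log)
  log << "debit #{order['from']}" # side effect happens first...
  amount = order["amount"]
  # ...and the check arrives too late to undo it.
  raise ArgumentError, "bad amount" unless amount.is_a?(Integer) && amount.positive?
  log << "credit #{order['to']}"
end

# Recognize first: the whole input is accepted or rejected up front.
def valid_order?(order)
  order.is_a?(Hash) &&
    order.keys.sort == %w[amount from to] &&
    order["from"].is_a?(String) &&
    order["to"].is_a?(String) &&
    order["amount"].is_a?(Integer) &&
    order["amount"].positive?
end

def checked_transfer(order, log)
  raise ArgumentError, "rejected input" unless valid_order?(order)
  log << "debit #{order['from']}"
  log << "credit #{order['to']}"
end
```

With a bad amount, the shotgun version has already debited before it notices; the checked version never touches the log.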

weak parsers

checking

strong languages

    if in_xml =~ /valid-codes/
      fire_z_missiles(in_xml)
    end
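
Why that check is weak, as a sketch: an unanchored regex only proves a substring exists somewhere in the document. The `<launch>` format and both helper names are invented for illustration, and the "strict" version only works because this toy language is a single fixed string.

```ruby
# Unanchored check in the spirit of the slide.
WEAK_CHECK = /valid-codes/

# Anchored check: the whole input must be exactly the one expected document.
STRICT_CHECK = /\A<launch>valid-codes<\/launch>\z/

def weak_check_passes?(in_xml)
  WEAK_CHECK.match?(in_xml)
end

def strict_check_passes?(in_xml)
  STRICT_CHECK.match?(in_xml)
end

GOOD_XML = "<launch>valid-codes</launch>"
# Extra attacker-supplied markup rides along with the expected substring.
SNEAKY_XML = '<launch>valid-codes</launch><launch target="elsewhere">x</launch>'
```

Both inputs pass the weak check; only `GOOD_XML` passes the strict one.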

TL;DR Use JSON

WRT Rails

parsing too early, too generally

        |  IN      |
  Rails |[parse   ]|
        |          |
  ------+----------+
  App   |[validate]|
        |[ use    ]|

        |  IN      |
  Rails |          |
  ------+----------+
  App   |[parse   ]|
        |[validate]|
        |[ use    ]|

Goal:

rm -rf undefined_behavior

don't allow undefined inputs!

Only allow defined inputs!

Patched Postel Principle

    The Postel Principle Patch:

    --- ietf/postels-principle
    +++ ietf/postels-principle
    - Be liberal about what you accept.
    + Be definite about what you accept.(*)
    +
    + Treat inputs as a language, accept it with a matching computational
    + power, generate its recognizer from its grammar.
    +
    + Treat input-handling computational power as privilege, and reduce it
    + whenever possible.
    +
    +
    + (*) For the sake of your users, be definite about what you accept.
    + Being liberal worked best for simpler protocols and languages,
    + and is in fact limited to such languages; be sure to keep your
    + language regular or at most context free (no length fields).
    + Being more liberal did not work so well for early IPv4 stacks:
    + they were initially vulnerable to weak packet parser attacks, and
    + ended up eliminating many options and features from normal use.
    + Furthermore, presence of these options in traffic came to be regarded
    + as a sign of suspicious or malicious activities, to be mitigated by
    + traffic normalization or outright rejection. At current protocol
    + complexities, being liberal actually means exposing the users of your
    + software to intractable or malicious computations.

Be definite about what you accept.(*)

Treat inputs as a language, accept it with a matching computational power, generate its recognizer from its grammar.

Treat input-handling computational power as privilege, and reduce it whenever possible.

Defining inputs...

Isn't that a pain?

Yes! yes it is!

But, we already do!

Rails input validation

Rails 3 attr_accessible

    attr_accessible :name
    attr_accessible :name, :credit_rating, :as => :admin

pluses

all access controlled

blows up on bad keys

minuses

definition / usage in different locations

different use case handling

Rails input validation

Rails 3 attr_accessible

Rails 4 Strong Parameters

    params.require(:person).
      permit(:name, :age,
             pets_attributes: [ :name, :category ])

Pluses

definition / usage in same location

schema allows for nesting

minuses

silently strips unpermitted keys

Schema is ambiguous
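
The "silently strips" minus, sketched in plain Ruby without Rails. Both helpers and the `ALLOWED` list are hypothetical; they only mimic the two possible behaviors.

```ruby
ALLOWED = %w[name age].freeze

# Mirrors the "silently strips" behavior: unknown keys vanish without a trace.
def strip_unpermitted(input)
  input.select { |key, _value| ALLOWED.include?(key) }
end

# Definite alternative: unknown keys make the whole input invalid.
def reject_unpermitted(input)
  extra = input.keys - ALLOWED
  raise ArgumentError, "unexpected keys: #{extra.join(', ')}" unless extra.empty?
  input
end
```

Given `{ "name" => "me", "admin" => "true" }`, the first helper quietly returns `{ "name" => "me" }`, while the second raises, so the caller learns the input was never in the accepted language.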

Why not earlier?

muskox

Muskox is a schema based Parser Generator

Give it a JSON Schema definition

    extend Muskox::Extensions
    add_parser :user, type: :object,
      properties: {
        name: { type: :string },
        email: { type: :string }
      }

And it will only parse strings that match the schema

    MyParsers.parsers[:user].
      parse(
        %!{"name":"me", "email":"x@y.com"}!)
    # => {"name"=>"me", "email"=>"x@y.com"}

    MyParsers.parsers[:user].parse(
      %!{"hash_dos1":1, "hash_dos2":1, "hash_dos3":1}!)
    # Muskox::ParserError:
    #   Unexpected property: [hash_dos1] at root.
    #   Allowed properties: [name, email]

Structure

Break Parser in Two

tokenizer

recognizes language (JSON)

passes tokens to validator

validator

validates against provided schema

uses tokens to create Ruby objects
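
A toy version of that split, to show the shape only. This is not Muskox's implementation: it handles just a flat JSON object of unescaped string values, and every name in it (`ParseError`, `tokenize`, `validate`, `parse_user`) is made up for the sketch.

```ruby
require "strscan"

class ParseError < StandardError; end

# Stage 1: the tokenizer recognizes the (tiny) language and emits tokens.
# It knows nothing about schemas.
def tokenize(text)
  scanner = StringScanner.new(text)
  tokens = []
  until scanner.eos?
    if scanner.scan(/\s+/)
      next
    elsif scanner.scan(/[{}:,]/)
      tokens << [:punct, scanner.matched]
    elsif scanner.scan(/"([^"\\]*)"/)
      tokens << [:string, scanner[1]]
    else
      raise ParseError, "unexpected character at offset #{scanner.pos}"
    end
  end
  tokens
end

# Stage 2: the validator checks tokens against the schema (here, a list of
# allowed property names) and builds the Ruby object as it goes.
def validate(tokens, allowed)
  queue = tokens.dup
  take = lambda do |type, value = nil|
    tok_type, tok_value = queue.shift
    unless tok_type == type && (value.nil? || tok_value == value)
      raise ParseError, "expected #{value || type}, got #{tok_value.inspect}"
    end
    tok_value
  end

  result = {}
  take.call(:punct, "{")
  unless queue.first == [:punct, "}"]
    loop do
      key = take.call(:string)
      raise ParseError, "Unexpected property: [#{key}]" unless allowed.include?(key)
      raise ParseError, "Duplicate property: [#{key}]" if result.key?(key)
      take.call(:punct, ":")
      result[key] = take.call(:string)
      break unless queue.first == [:punct, ","]
      queue.shift # consume the comma and read another pair
    end
  end
  take.call(:punct, "}")
  raise ParseError, "trailing tokens" unless queue.empty?
  result
end

def parse_user(text)
  validate(tokenize(text), %w[name email])
end
```

An unexpected property is rejected before any Ruby object for it exists, which is the point of validating at the token level.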

MuskOx w/ Rails

Replace strong params

    def login_params
      params.require(:user).
        permit(:login, :password)
    end

Replace Strong Params

    muskox_params :user_params do |m|
      m.require(:user).
        permit(:login, :password)
    end

Muskox Future

new tokenizer formats: XML, Form encoded data, ...

References

langsec.org

github.com/baroquebobcat/muskox

Nick Howard

@baroquebobcat


