Break&Build Security: January 2015

Thursday, January 15, 2015

How do I easily create a random Key?

I've met many developers who are conscious enough to use encryption for confidentiality, and who are willing to use algorithms such as HMAC for authenticity and integrity when told to.

But when they do implement them, they usually stumble when they have to choose a Key.

They usually choose some word (like a name) or a phrase, which is probably in a dictionary.

This is especially dangerous if the ciphertext is sent back and forth to the browser, or if, for example, you are trusting HMAC-signed requests and parameters.

An attacker might brute force the Key using a dictionary and then decipher the data or craft new data. The same goes for HMAC: once the Key is known, the attacker can create valid requests and do harm.
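To make the risk concrete, here is a minimal sketch in Java of how little work it takes to recover a dictionary-word HMAC key. The class name WeakKeyDemo, the four-word list and the signed message are made up for illustration; a real attacker would simply use a much longer list.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.MessageDigest;
import java.util.Arrays;
import java.util.List;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

// Illustration only: recovers an HMAC key that was picked from a word list.
final class WeakKeyDemo {

    static byte[] hmacSha256(String key, String message) {
        try {
            Mac mac = Mac.getInstance("HmacSHA256");
            mac.init(new SecretKeySpec(key.getBytes(StandardCharsets.UTF_8), "HmacSHA256"));
            return mac.doFinal(message.getBytes(StandardCharsets.UTF_8));
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    // Tries every candidate word as the key; returns the word that reproduces the tag, or null.
    static String guessKey(List<String> words, String message, byte[] tag) {
        for (String word : words) {
            if (MessageDigest.isEqual(hmacSha256(word, message), tag)) {
                return word;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        String message = "user=alice&role=user";      // hypothetical signed parameters
        byte[] tag = hmacSha256("sunshine", message);  // key taken from a dictionary
        List<String> words = Arrays.asList("password", "letmein", "sunshine", "dragon");
        System.out.println("recovered key: " + guessKey(words, message, tag));
    }
}
```

With a key made of 16 or more random bytes, the same loop would have no realistic list of candidates to try.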

I know that the best approach is to rotate the generated keys, for example every day, and to generate them with a secure random algorithm.

But not everyone needs that level of security, so if you need to quickly create a random Key without using web sites that do it for you, these are some commands you can use on Linux or OS X:

If you are using AES-128 you need a 128-bit (16-byte) key.

$ head -c16 /dev/random | base64

Change -c16 to whatever size you need.

Since /dev/random generates binary data, it's best to transform it to a hex string or a Base64 string.
In the code you should decode the string back to binary data before using it.
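As a rough sketch of that decode step in Java: the class and method names below (AesKeyDemo, newBase64Key, loadAesKey) are hypothetical, and SecureRandom stands in for the shell command so the example is self-contained. The trim() call is there because the base64 command ends its output with a newline, which the Java decoder would otherwise reject.

```java
import java.security.SecureRandom;
import java.util.Base64;
import javax.crypto.SecretKey;
import javax.crypto.spec.SecretKeySpec;

final class AesKeyDemo {

    // Rough Java counterpart of `head -c16 /dev/random | base64`.
    static String newBase64Key(int numBytes) {
        byte[] raw = new byte[numBytes];
        new SecureRandom().nextBytes(raw);
        return Base64.getEncoder().encodeToString(raw);
    }

    // Decodes the stored Base64 text back to binary before it is used as a key.
    static SecretKey loadAesKey(String base64Key) {
        byte[] raw = Base64.getDecoder().decode(base64Key.trim());
        if (raw.length != 16 && raw.length != 24 && raw.length != 32) {
            throw new IllegalArgumentException("AES keys are 16, 24 or 32 bytes, got " + raw.length);
        }
        return new SecretKeySpec(raw, "AES");
    }

    public static void main(String[] args) {
        String stored = newBase64Key(16); // the string you would keep in your configuration
        SecretKey key = loadAesKey(stored);
        System.out.println(stored + " -> " + key.getEncoded().length + " bytes");
    }
}
```

Using the 24-character Base64 text directly as the AES key would fail the length check, which is exactly the mistake the decode step avoids.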

For HMAC, the size depends on which hash algorithm you want to use. Even though MD5 and SHA-1 are no longer considered secure as plain hashes, their use within HMAC has not been shown to be insecure.
But if you can choose, choose the SHA-2 family.

The rule of thumb is to generate a Key at least the size of the hash output.

For SHA-1 it's 160 bits or 20 bytes:

$ head -c20 /dev/random | base64

For SHA-256, as its name suggests, it's 256 bits or 32 bytes:

$ head -c32 /dev/random | base64

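I don't believe it's necessary to Base64 decode these keys when they are used with HMAC: HMAC accepts any byte string as its key, so the Base64 text itself can serve as the Key, as long as every party uses exactly the same bytes.

If you want hex-encoded output, use xxd as shown below.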
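The following Java sketch illustrates that point with HMAC-SHA256. HmacKeyDemo is a made-up name, and the key is generated on the spot rather than read from a file. Both ways of using the key work, but they produce different tags, so the signer and the verifier have to pick the same one.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.SecureRandom;
import java.util.Arrays;
import java.util.Base64;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

final class HmacKeyDemo {

    static byte[] hmacSha256(byte[] key, String message) {
        try {
            Mac mac = Mac.getInstance("HmacSHA256");
            mac.init(new SecretKeySpec(key, "HmacSHA256"));
            return mac.doFinal(message.getBytes(StandardCharsets.UTF_8));
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Stand-in for the output of `head -c32 /dev/random | base64`.
        byte[] raw = new byte[32];
        new SecureRandom().nextBytes(raw);
        String base64Key = Base64.getEncoder().encodeToString(raw);

        byte[] keyAsText = base64Key.getBytes(StandardCharsets.US_ASCII); // Base64 text used as-is
        byte[] keyDecoded = Base64.getDecoder().decode(base64Key);        // decoded back to 32 raw bytes

        byte[] tagText = hmacSha256(keyAsText, "amount=10&to=bob");
        byte[] tagDecoded = hmacSha256(keyDecoded, "amount=10&to=bob");
        System.out.println("tag length: " + tagText.length + " bytes");
        System.out.println("same tag for both key forms: " + Arrays.equals(tagText, tagDecoded));
    }
}
```

Either form carries the same 256 bits of randomness; the Base64 text is just a longer representation of it.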

$ xxd -l 32 -p -c80 /dev/random

Just for your information, /dev/random is a device backed by a cryptographically secure pseudo-random number generator (CSPRNG) which is "continuously fed" with entropy. Use of /dev/urandom is also recommended; it exists on both Linux and OS X (on OS X the two devices behave the same).

When you start your computer, a pool of entropy starts to fill. Since entropy is hard to get, the pool fills slowly, so if you keep reading random numbers from /dev/random on Linux you might deplete it, and you will have to wait for new entropy. If this is your case, maybe because you scripted something that constantly reads from /dev/random, you can use the device /dev/urandom, which is a non-blocking device. Whenever the pool is depleted it keeps working as a pseudo-random number generator, and when it gets new entropy it uses it.
You can read more about the difference between these 2 devices here.
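If the thing reading random bytes over and over is a Java program rather than a shell script, a sketch along these lines reads from the non-blocking device. UrandomKeyDemo and randomBytes are hypothetical names, and the fallback to SecureRandom is my own assumption for platforms where /dev/urandom is not readable.

```java
import java.io.DataInputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.security.SecureRandom;

final class UrandomKeyDemo {

    // Reads numBytes from /dev/urandom when it is readable; otherwise uses SecureRandom.
    static byte[] randomBytes(int numBytes) {
        byte[] out = new byte[numBytes];
        File device = new File("/dev/urandom");
        if (device.canRead()) {
            try (DataInputStream in = new DataInputStream(new FileInputStream(device))) {
                in.readFully(out); // keeps reading until the array is full
                return out;
            } catch (IOException e) {
                // fall through and fill the array with SecureRandom instead
            }
        }
        new SecureRandom().nextBytes(out);
        return out;
    }

    public static void main(String[] args) {
        StringBuilder hex = new StringBuilder();
        for (byte b : randomBytes(32)) {
            hex.append(String.format("%02x", b));
        }
        System.out.println(hex); // 64 hex characters, similar to the xxd command
    }
}
```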

Tuesday, January 13, 2015

Allow external URLs in my site?

If you are running a website that allows end users to create content, as almost any interactive site does nowadays, you have to be very careful about how you allow user-supplied URLs in it.

For example, if you are Facebook, you allow users to paste URLs on their wall.
You might even allow URLs to be used in a JavaScript or meta tag redirection.

Some developers might opt to use an HTML encoder, breaking many URLs.

More user-conscious developers might opt to create very thorough regular expressions allowing the most common characters, but that may prevent legitimate users from entering real URLs.

So I decided to try to solve this problem. I googled a bit, but couldn't find the answer.
I found some discussions on Stack Overflow, and some people even asked how Stack Overflow itself did it, but there was no reference to a library or code.

In the past I've used OWASP ESAPI, and I usually recommended it in my classes.
There's an encoder that I had never used, but I thought its name, "encodeForURL", was self-descriptive.

Even the description sounds promising:

"Encode for use in a URL. This method performs URL encoding on the entire string."

So I tried it.

What I got for this sample URL:

"http://www.google.com/dir/?test=test&test2=test<script>alert(1);<ScRiPt>"

Was:

http%3A%2F%2Fwww.google.com%2Fdir%2F%3Ftest%3Dtest%26test2%3Dtest%3Cscript%3Ealert%281%29%3B%3CScRiPt%3E

Of course the link, if used in HTML such as

<a href="http%3A%2F%2Fwww.google.com%2Fdir%2F%3Ftest%3Dtest%26test2%3Dtest%3Cscript%3Ealert%281%29%3B%3CScRiPt%3E">

won't work.

If you look at the source code of the method, it's just a call to the URLEncoder.encode method.
It's not a bug; I just misunderstood how that library works.
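The result can be reproduced without ESAPI. This is a minimal sketch that calls java.net.URLEncoder directly, assuming (as described above) that the ESAPI method simply delegates to it. WholeUrlEncodeDemo and encodeAll are made-up names.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

final class WholeUrlEncodeDemo {

    // Encodes the entire string, including the scheme and the slashes.
    static String encodeAll(String url) {
        try {
            return URLEncoder.encode(url, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // UTF-8 is always supported
        }
    }

    public static void main(String[] args) {
        String url = "http://www.google.com/dir/?test=test&test2=test<script>alert(1);<ScRiPt>";
        System.out.println(encodeAll(url));
    }
}
```

Because ":" and "/" are encoded too, the browser no longer sees a scheme or a host, so it treats the value as a relative path.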

I could have continued googling or looking at the source code of an open source forum such as phpBB, but I decided to take matters into my own hands.

I decided to first validate the URL, being strict about how it's composed up to the end of the domain name. I might have left out some legitimate URLs (I did not even consider IDN).

If it passes this regex validation

^(https?:\\/\\/)([a-zA-Z0-9-_\\.]+)(:[0-9]{1,5})?((\\?|\\/)(.*))?$

it then URL encodes everything after the domain name and decodes certain special characters such as / ? # = + & . ,

This double work might not be efficient, but I would rather blacklist everything and then whitelist what I allow than the other way around, where I could forget to blacklist some dangerous character.
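To show what the validation step captures, here is a small sketch that applies the same pattern and returns the pieces. UrlValidationDemo and split are hypothetical names used only for this illustration.

```java
import java.util.Arrays;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class UrlValidationDemo {

    static final Pattern URL_PATTERN = Pattern.compile(
            "^(https?:\\/\\/)([a-zA-Z0-9-_\\.]+)(:[0-9]{1,5})?((\\?|\\/)(.*))?$");

    // Returns {scheme, host, port, separator, rest}, or null when the URL is rejected.
    // Optional parts that are absent come back as null entries.
    static String[] split(String url) {
        Matcher m = URL_PATTERN.matcher(url);
        if (!m.matches()) {
            return null;
        }
        return new String[] { m.group(1), m.group(2), m.group(3), m.group(5), m.group(6) };
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(split("http://www.google.com:8080/dir/?a=b")));
        System.out.println(Arrays.toString(split("http://www.google.com")));
        System.out.println(Arrays.toString(split("javascript:alert(1)")));
    }
}
```

Only the last element, the part after the first / or ?, is passed on to the encoding step. The scheme, host and port are copied as validated.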

The code can be found here or below.

You are encouraged to use it and change it, but I'm not responsible for it.

This code assumes that the input is canonicalized first, so it must not contain URL-encoded input.

import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class encodeUrl {

    public static void main(String[] args) {

        String testUrl = "http://www.google.com/dir/?test=test&test2=test+1,1.<script>alert(1);<ScRiPt>";

        if (args.length > 0) {
            testUrl = args[0];
        }

        System.out.println("in:  " + testUrl);

        // Strict up to the end of the host (and optional port); the rest is captured in group 6.
        String pattern = "^(https?:\\/\\/)([a-zA-Z0-9-_\\.]+)(:[0-9]{1,5})?((\\?|\\/)(.*))?$";

        Pattern r = Pattern.compile(pattern);
        Matcher m = r.matcher(testUrl);

        if (m.matches()) {
            // Scheme, host, port and the first separator are copied as validated.
            String out = "";
            out += (m.group(1) != null) ? m.group(1) : "";
            out += (m.group(2) != null) ? m.group(2) : "";
            out += (m.group(3) != null) ? m.group(3) : "";
            out += (m.group(5) != null) ? m.group(5) : "";

            // group(6) is null when nothing follows the host; encoding null would throw.
            String rest = (m.group(6) != null) ? m.group(6) : "";

            try {
                // Encode everything, then restore only the whitelisted characters.
                // Note: URLEncoder writes a space as "+", so spaces and literal "+" look the same afterwards.
                out += URLEncoder.encode(rest, "UTF-8")
                        .replace("%2F", "/").replace("%3F", "?").replace("%23", "#")
                        .replace("%3D", "=").replace("%26", "&").replace("%2B", "+")
                        .replace("%2C", ",").replace("%2E", ".");
                System.out.println("out: " + out);
                System.out.println("just encode: " + URLEncoder.encode(testUrl, "UTF-8"));
            } catch (UnsupportedEncodingException e) {
                System.err.println(e);
            }
        } else {
            System.out.println("out: not a valid url");
        }
    }
}

After writing this post, I started wondering whether all of this is needed: why not just escape the single and double quote characters and make sure that HTML attributes are enclosed in them?
At first that seemed to solve the problem in a much easier way.

But then I thought of using the newline character %0A in the input (Firefox on OS X doesn't need the %0D for this attack to work).
Something like this:
"http://www.victim.com?param=%0A</script><script>alert(1);<script>"

In the response I got something like this:

<a href="http://www.victim.com?param=
</script><script>alert(1);<script>" >link</a>

(the browser renders the newline, that's why it's 2 lines here)

Which triggered the "alert(1)".

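I didn't test my encoder with the canonicalized version of
"http://www.victim.com?param=%0A</script><script>alert(1);<script>".
Note that . in a Java regex does not match a newline by default, so the validation above would most likely reject this input as not a valid URL. If the newline were allowed through to the encoding step, the result should look something like this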

http://www.victim.com?param=%0A%3C/script%3E%3Cscript%3Ealert%281%29%3B%3Cscript%3E

which is harmless.
(there is no newline as before, it looks like that because of the Post width.)
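Since this case was not tested in the post, here is a small sketch that checks both halves of it: whether the validation pattern accepts a URL containing a raw newline, and what the encoding step produces for the part after the "?". NewlinePayloadDemo and its method names are hypothetical; the pattern and the whitelist are copied from the code above.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.regex.Pattern;

final class NewlinePayloadDemo {

    static final Pattern URL_PATTERN = Pattern.compile(
            "^(https?:\\/\\/)([a-zA-Z0-9-_\\.]+)(:[0-9]{1,5})?((\\?|\\/)(.*))?$");

    // True when the whole input matches the validation pattern.
    static boolean passesValidation(String url) {
        return URL_PATTERN.matcher(url).matches();
    }

    // Encodes everything, then restores the whitelisted characters.
    static String encodeRest(String rest) {
        try {
            return URLEncoder.encode(rest, "UTF-8")
                    .replace("%2F", "/").replace("%3F", "?").replace("%23", "#")
                    .replace("%3D", "=").replace("%26", "&").replace("%2B", "+")
                    .replace("%2C", ",").replace("%2E", ".");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // UTF-8 is always supported
        }
    }

    public static void main(String[] args) {
        // "\n" is the canonicalized form of %0A.
        String rest = "param=\n</script><script>alert(1);<script>";
        System.out.println("passes validation: " + passesValidation("http://www.victim.com?" + rest));
        System.out.println("encoded rest: " + encodeRest(rest));
    }
}
```

Either way the payload does not reach the page as markup: the pattern rejects the raw newline, and the encoder would turn it back into %0A along with the angle brackets.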

Intro

Hi, this is not an introduction of myself; rather, it's an introduction of this blog.

I'll try to stay true to the title of the blog, writing about building secure code, improving web site security, and a couple of things about breaking sites and how to prevent it.
I'll also include some side projects regarding security monitoring, or anything else related to security.

What I want to make clear too is that all the code was written by me during my free time, and that doesn't mean it is being used at work. Most of the code is more like a PoC than ready-to-use code.

When I state an opinion, it's my own, and it does not reflect my employer's.

Enjoy!