Sandcastles & Security

Posted by Keith Casey on July 27, 2010
IN Development
 

By the time I was 10, my mom stopped buying me electronic toys. While the first few days would go well, it was just a matter of time before I’d turn each one into a mass of wires and circuitry. While I’d sometimes manage to turn it into something else, often it would merge with other masses of wires and circuitry. Unfortunately, I’m pretty sure Marco and Cal are on their way to a similar conclusion.

After playing with Flex off and on for a couple of months, I decided I would try to break it. I’m not a security guy at heart, but I’ve listened closely and improved my own stuff, so I quickly came up with four ways that I might be able to cause problems with Flex. Here are my results with each. To be clear, it is not my goal to be a nefarious troublemaker and break everything. My goal is to find out where things could break.

First, the easiest to get started with is general web access. If we can get a Flex application to load a specific URL, we might be able to make it misbehave. Out of the box, there are a few different ways to do this. We can use the URLLoader (immediately below) or if you’re using AIR, the HTMLLoader (second sample), but either way it’s just a handful of lines of code:

var loader:URLLoader = new URLLoader();
var path:String = txtURL.text;
var request:URLRequest = new URLRequest(path);
loader.addEventListener(Event.COMPLETE, captureContent);
loader.load(request);

This assumes another function – captureContent – is written but all it does is add the content into a TextArea. As far as I can find, this one is secure. The local client doesn’t execute anything in the content, it simply loads it as a string. In order for anything to be executed, the client would specifically have to parse it out. This seems like a lot of effort since if you can do that, you can try a simpler attack using ActionScript itself.

// container is assumed to be a Sprite that is already on the display list.
var container:Sprite;
var html:HTMLLoader = new HTMLLoader();
html.width = 400;
html.height = 600;
var urlReq:URLRequest = new URLRequest("http://www.adobe.com/");
html.load(urlReq);
container.addChild(html);

This sample was lifted from the HTMLLoader documentation on Adobe’s site.

Since this can render HTML and even JavaScript, it seems like a more promising target. Luckily, there are a handful of things – namely eval(), setTimeout(), setInterval(), and the new Function() constructor – that are immediately blocked by the engine itself and even generate errors. More importantly, this executes within its own sandbox with even fewer permissions than the core engine and without access to any of the AIR APIs. This sandbox keeps its own browser history and cookies and is fully manipulable from the AIR libraries but not vice versa.

Next, let’s see what we can do closer to home with simple network access. If I can look at the resources on your network – computers, file shares, printers, Coke machines – I may be able to scan them for vulnerabilities or steal the design documents for your BFG 9000. To get started, I went with the basic SocketMonitor class. With three lines of code, I was able to monitor network connectivity. While not my immediate goal, it seems like the easiest way to detect when I should work off a local cache or interact with the server. This will be useful in the web2project AIR client, but for now, back to my nefarious plans… When I attempted to scan the network or connect to specific machines with anything beyond a boring HTTP request, I couldn’t accomplish anything useful. That is, until I discovered NativeProcess, but more on that later.

Next, let’s see what we can do to the filesystem. For a malicious troublemaker, this approach would be the most difficult – they’d have to get the user to download and run the application with the hope that AIR is installed – but it’d also be the most powerful. For someone like me just seeing what the limits are and where I might run into trouble, we’ll take it easy and not break everything. So let’s see what we can delete. With two lines of code – the trace statements are debugging – I was able to delete a key file:

var myFile:File = new File('C:/xampp/php/php.ini');
trace(myFile.exists);
trace(myFile.nativePath);
myFile.deleteFile();

But maybe I don’t want to delete files; maybe I just want to collect information. In that case, a simple myFile.load() will get the contents of the file.

Since I was previously more familiar with Flash in the browser, I was surprised, but digging deeper into the security sandbox clarified it. Just like any application installed in your local environment, an AIR application inherits its permissions from the user running it. While there’s not much you can do to further lock AIR down, you should ensure that all applications come from sources you trust.

Next, we can attempt database access. While the SQLite library is built into AIR out of the box, connecting to MySQL or SQL Server is not possible without an additional library. More importantly, since the recommended way of “connecting” to one of these databases is through an XML or AMF-based service layer, this isn’t likely to be an effective attack. That said, interacting with SQLite is quite simple so you could attempt to connect to other SQLite databases for other applications. Fortunately, it’s almost trivial to encrypt your database in an AIR application, so this is a remote possibility at best.

Finally, after much investigation, I found the biggest opportunity for trouble: the NativeProcess class. This class – only available since the AIR 2 SDK was released last month – allows AIR to interact directly with other applications on the host operating system. You can start processes, kill processes, execute shell scripts, or just do things on the command line in general. Obviously, this can be trouble. As a simple test, I started a new process within an infinite loop and watched the debugger crawl. To finish the earlier thought, once you have command line access, you can scan the network, connect to file shares, and even interact with printers. Since these commands have to be operating system specific, it would be difficult to build a general purpose tool, but you could accomplish some creative tasks.

Of course, we can behave ourselves and use this to create amazing things too. There’s even an example on creating your own screen recorder in AIR 2.

Overall, I was hoping to break Flex and see where I might be able to cause some problems or protect myself from problems. I found a few places – like with the HTML processing and filesystem access – that could be problematic and should be handled with care, but realistically, a Flex/AIR application behaves exactly like you’d expect any desktop application to behave. Granted, as web developers we’ve rarely had to worry about that, but it’s something we need to consider and get used to as we work on the desktop more and more.


About the author—D. Keith Casey, Jr. has been a PHP developer for about seven years and a professional agitator within the local Washington, DC tech community for a few years longer. To pay the bills, he works as the CTO of Blue Parabola, LLC on large-scale PHP-based systems for organizations ranging from major news media companies to small non-profits. In his spare time, he is a core contributor to web2project, works to build and support the DCPHP community and BarCampDC community, and blogs regularly on technology issues.
 
 
 

PHP 5.3.3 and 5.2.14 are out

Posted by Giorgio Sironi on July 26, 2010
IN Development ·News
 

July 22 saw the release of two new versions of PHP from the main development lines: the innovative 5.3.x and the previous 5.2.x. There is interesting news in these new releases, which are both stable ones.

PHP 5.3.3

This release contains one incompatible change, but note that this BC break is only relevant to namespaced classes. This makes PHP 5.3.3 slightly backward incompatible with previous PHP 5.3 versions at most, but fixing the issue now at the start of the adoption period of 5.3.x was probably the best solution.

Basically, methods named like the class are not considered constructors in namespaced classes:

<?php
namespace Bar;

class Foo
{
    public function Foo() {} // not a constructor anymore
}

The previous behavior was causing problems, for example, with Zend Framework view helpers. These classes had an API method named after the base name of each helper class. With the migration to namespaces, the base name became the name of a constructor (which logically should have been the fully qualified name of the class; in the case above Bar\Foo, an illegal method name).

The fix is backward incompatible but clears up the picture and treats only the __construct() method as the constructor for namespaced classes. Zend Framework 2 will probably stop using methods named this way in favor of a conventional name like direct(), but this change actually simplifies the porting of PHP 5 code to namespaces by removing a possible clash of method names with the constructor.
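To make the change concrete, here is a minimal sketch that extends the Bar\Foo example above. The __construct() method and the $calls property are additions made purely for illustration. On PHP 5.3.3 and later, only __construct() should run on instantiation, while Foo() stays an ordinary method that has to be called explicitly.

```php
<?php
namespace Bar;

class Foo
{
    // Hypothetical log of which methods have run, for illustration only.
    public $calls = array();

    public function __construct()
    {
        $this->calls[] = '__construct';
    }

    // In a namespaced class this is a regular method, not a constructor.
    public function Foo()
    {
        $this->calls[] = 'Foo';
    }
}

$foo = new Foo();
// At this point only __construct() has been recorded.
$foo->Foo();
// Now both entries are present, in call order.
```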

PHP 5.3.3 is also a maintenance release which corrects many bugs involving memory corruption and buffer overflows. It also upgrades the PCRE and sqlite extensions.

PHP 5.2.14

This release is a maintenance release for the old 5.2.x branch, which marks the end of the active support for this line of development.

This means bug fixes won’t be available for PHP 5.2 anymore, and only security fixes will be evaluated for inclusion from now on, on a case by case basis. PHP 5.2 is essentially being phased out in favor of the faster and more powerful 5.3 version.

Personally, I’m glad the process of upgrading from 5.2 to 5.3 has already started, given the previous experience with PHP 4. PHP 4’s end-of-life period was extended until it reached the final deadline of 8-8-8 (August 8th, 2008), and it is still around on some servers nowadays.

If you use PHP 5.2, there is a migration page to help you transition your applications to PHP 5.3, which lists the few incompatible changes you’ll have to address. Most old PHP code already works out of the box in PHP 5.3, and you’ll gain helpful new features like namespaces and anonymous functions.
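As a small taste of one of those features, the following sketch uses an anonymous function with a use clause to filter an array. The variable names and the sample data are made up for illustration.

```php
<?php
// Hypothetical data: release numbers as strings.
$releases = array('5.2.14', '5.3.2', '5.3.3', '4.4.9');
$minimum  = '5.3.0';

// Anonymous functions (closures) arrived in PHP 5.3; "use" imports
// $minimum from the enclosing scope into the closure.
$isModern = function ($version) use ($minimum) {
    return version_compare($version, $minimum, '>=');
};

// array_filter() preserves keys, so re-index with array_values().
$modern = array_values(array_filter($releases, $isModern));
```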


About the author—Giorgio Sironi is a freelancer developer and Bachelor of Science in Computer Engineering cum laude. He now focuses on contributing to PHP open source projects and blogs regularly at Invisible to the eye.
 
 
 

ORMs and relational databases: powerful tools or dumb ideas?

Posted by Giorgio Sironi on July 22, 2010
IN Opinion
 

A recent article on Object-Relational Mappers and the rise of non-relational databases has spawned a lot of comments on their benefits and drawbacks.

ORMs have been given many nicknames over the years, like the Vietnam of computer science. In the opinion of the author, ORMs are dumb, as it is a fundamentally broken idea to store an object graph into a relational database.

ORMs and ODMs

Are ORMs really so useless? JBoss’s Hibernate, the first generic ORM worthy of the name, was a revolution for the Java world, to the point that the PHP open source community is following up by creating Doctrine 2 as a Hibernate for PHP.

It’s no mystery that ORMs are sometimes a leaky abstraction, and that you’ll have to write a bit of SQL from time to time at critical points to improve their performance. But this is also true for every NoSQL store when it is used alongside an object model.

For example, consider the MongoDB Object Document Mapper project started by Jonathan Wage, one of the core Doctrine developers. It is the equivalent of Doctrine, but it uses MongoDB, a non-relational database, as the target storage for an object graph. Here you have to do a mapping anyway, only it is performed by an ODM instead of an ORM.
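To illustrate what that mapping step involves regardless of the storage engine, here is a deliberately tiny hand-written mapper. The Article class, the ArticleMapper class, and the field names are all hypothetical; real ORMs and ODMs such as Doctrine do far more than this sketch suggests.

```php
<?php
// Hypothetical domain object with encapsulated state.
class Article
{
    private $title;
    private $tags;

    public function __construct($title, array $tags)
    {
        $this->title = $title;
        $this->tags  = $tags;
    }

    public function getTitle() { return $this->title; }
    public function getTags()  { return $this->tags; }
}

// Hypothetical mapper: translates between objects and plain arrays,
// which could then be stored as a table row or as a document.
class ArticleMapper
{
    public function toRecord(Article $article)
    {
        return array(
            'title' => $article->getTitle(),
            // A relational row needs a scalar here; a document store
            // could keep the array as-is.
            'tags'  => implode(',', $article->getTags()),
        );
    }

    public function fromRecord(array $record)
    {
        $tags = $record['tags'] === '' ? array() : explode(',', $record['tags']);
        return new Article($record['title'], $tags);
    }
}

$mapper  = new ArticleMapper();
$record  = $mapper->toRecord(new Article('ORMs', array('php', 'orm')));
$article = $mapper->fromRecord($record);
```

Whether the record ends up in a table or a document collection, some layer has to perform this translation in both directions.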

How to persist an object graph

The only pure solution for persisting an object graph is to do a periodic dump of the RAM (not always viable), or to run an object-oriented database as the primary storage mechanism.

Object-oriented databases raise lots of issues, apart from the poor support by hosting services. They’re specific to a programming language, and often violate the encapsulation of the single objects. Ideally, you shouldn’t be able to access private properties of objects to make queries on them.

It seems that an object-oriented model is not well suited to being persisted anyway: it is based on local graph-traversal algorithms (and the logic of where to go in the graph is distributed across the methods of the various objects). Databases in general are instead based on declarative languages such as SQL or its ORM-spiced versions like the Hibernate or Doctrine Query Language.

Still, being able to think in objects gives a developer many advantages, mainly ease of testing and freedom of modeling. Every PHP framework is object-oriented nowadays, even if PHP has historically been behind other languages in the adoption of this paradigm.

At the same time, databases are good at storing data and performing bulk operations on it directly. Every ORM lets you define UPDATE or DELETE queries which act on a large data set but do not necessarily result in the reconstitution of a large part of the object graph.

Conclusions

Thus, ORMs meet many use cases for developers who want a level of abstraction over relational data stores and want to avoid writing boilerplate code for data translation. Other types of Data Mappers suitable for NoSQL databases like MongoDB are growing. Do you think we will see the success or the failure of these solutions in the near future?


About the author—Giorgio Sironi is a freelancer developer and Bachelor of Science in Computer Engineering cum laude. He now focuses on contributing to PHP open source projects and blogs regularly at Invisible to the eye.
 
 
 

PHPDOCX: generating Word documents from PHP

Posted by Giorgio Sironi on July 21, 2010
IN News
 

PHPDOCX is a PHP library that allows its client code to generate Microsoft Word documents in the .docx format from PHP scripts. PHP is increasingly being used for disparate goals and has to deal with data that comes from strange sources and has to be produced in stranger formats. An off-the-shelf solution for the creation of Word documents from an arbitrary source (be it a database, an Excel spreadsheet or a CSV file) is indeed a good tool to keep at hand.

Starting with version 1.5, which was released on July 12th, PHPDOCX is compatible with PHP 5.3. The adoption of PHP 5.3 by operating systems is growing, and it will eventually replace the previous versions of PHP on the servers of hosting providers as well.

Features

PHPDOCX provides some standard features that you would commonly use when generating a document dynamically: managing text, lists, tables, images and graphic elements are all basic operations of document editing.

There are more useful features included in the library, which come in handy when dealing with long documents. For instance, insertion of headers, footers, page numbering, and tables of contents are all supported.

A final note on the feature list is the possibility of outputting PDF and HTML from a given Word document. The library is intended for the generation of reports, and being able to switch the output format at will is a great advantage.

Technicalities

PHPDOCX has no requirements for a functional version of MS Word, except for generating legacy versions of the documents (.doc format for Word 2004 or before).

The library does require the zip and xsl PHP extensions to work, but they are probably already installed on your server of choice, or easily available. Apart from that, a generic installation of PHP and Apache will suffice.
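If you are unsure whether your server has those extensions, a quick check along the following lines may help. This is only a sketch: the missingExtensions() helper is a hypothetical name, while extension_loaded() is a PHP built-in.

```php
<?php
// Returns the names from $required that are not loaded in this PHP build.
// The function name is hypothetical; extension_loaded() is a PHP built-in.
function missingExtensions(array $required)
{
    $missing = array();
    foreach ($required as $name) {
        if (!extension_loaded($name)) {
            $missing[] = $name;
        }
    }
    return $missing;
}

// The two extensions the article says the library needs.
$missing = missingExtensions(array('zip', 'xsl'));

if ($missing) {
    echo 'Missing PHP extensions: ' . implode(', ', $missing) . PHP_EOL;
} else {
    echo 'zip and xsl are both available.' . PHP_EOL;
}
```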

Licenses

Like many libraries for web development, PHPDOCX comes with more than one license.

The first possibility is to use the library under the LGPL license, which covers the free version. It has somewhat limited features in comparison to the Pro one, but it adds no watermarks to the produced documents, nor does it have time limits.

The Pro version has greater capabilities, like the insertion of graphs and MathML constructs for scientific documents. It also provides technical support, which may be the most compelling reason for its adoption.

In conclusion, PHPDOCX is a valid tool for managing the production of documents in one of the most widespread formats in the world. It also handles PDF and HTML, which guarantees interoperability with any end user’s machine.


About the author—Giorgio Sironi is a freelancer developer and Bachelor of Science in Computer Engineering cum laude. He now focuses on contributing to PHP open source projects and blogs regularly at Invisible to the eye.
 
 
 

Ext4Yii, bridging PHP and JavaScript frameworks together

Posted by Giorgio Sironi on July 15, 2010
IN Development
 

Ext JS

Ext JS is a (client-side) JavaScript library created for coding Rich Internet Applications. Nowadays almost no one uses bare JavaScript without an abstraction layer over the browser’s JavaScript implementation. Along with jQuery, Dojo, MooTools and many others, Ext JS provides JavaScript developers with a reusable toolset for modifying the DOM and building form controls or widgets like grids, tab panels and trees.

There has been some buzz lately about Ext JS being renamed to Sencha, but this was a rebranding of the company behind the product, which also offers other libraries like Raphael (SVG-related). Ext JS is still available as a standalone product on Sencha’s website. It is released for free under an open source license suited to other open source applications, and also under a commercial license for commercial use.

Yii framework

The Yii Web Programming Framework is a PHP 5 framework, mainly inspired by Prado. Yii strives for high performance, achieved through simplicity of design and very loose coupling between the various components. Some of the features of Yii include the classic MVC implementation, database access infrastructure, and support for theming, caching and localization.

Yii is released under the BSD license, like Zend Framework, so there are no limitations to its use in open source or commercial projects.

Ext4Yii

Ext4Yii is essentially a bridge between these two projects, which aims to integrate Ext JS as an extension for Yii.

Ext4Yii is implemented as a templating system, which consumes XML models where you define widget elements like buttons or handlers (in embedded JavaScript code which can take advantage of Ext-powered methods and objects).

When using the provided ExtController as the superclass of the Yii application’s SiteController, the XML models are readily integrated and instantiated as Ext JS objects, without the need to include lots of .js files and event hooks.

In the Ext4Yii documentation, there are various examples of bootstrapping Ext JS widgets using these higher-level models, which deal with layout elements and forms. More advanced features are included as well, like AjaxMethods, which allow a registered PHP function to be called via AJAX from the client side.

Learning a new modeling language can be complex, especially with verbose XML. Fortunately, there is a bundled plug-in for the NetBeans IDE which provides code completion for Ext4Yii. Even if you’re not a fan of IDEs, they sometimes are very handy in providing a rapid solution to bootstrap new tools like Ext4Yii.

Conclusion

PHP frameworks have been one of the most powerful innovations in the PHP landscape in recent years, and bridging them with a JavaScript solution is important for providing RIA behavior out of the box. There are similar successful solutions available on the web, like sfJqueryPlugin or Zend_Dojo, which prove that integrating a JavaScript library into a PHP framework is worth considering.


About the author—Giorgio Sironi is a freelancer developer and Bachelor of Science in Computer Engineering cum laude. He now focuses on contributing to PHP open source projects and blogs regularly at Invisible to the eye.
 
 
 

Modsecurity: Why it matters to PHP

Posted by Orlando Medina on July 9, 2010
IN News
 

There is a new book out that should be in the libraries of web application developers everywhere. The title? ModSecurity Handbook: The Complete Guide to the Popular Open Source Web Application Firewall by Ivan Ristic. What is ModSecurity in the first place? Why does it matter to you? What makes this book important to the practice of web application design?

ModSecurity is a web application firewall. It can live in and out of the Apache web server environment, one of the most popular web servers around. ModSecurity is highly customizable and extremely powerful. Its philosophy can be summed up in a few words: look, and only modify if I tell you to. Much of its power comes from a custom rule engine. The syntax takes a little bit of work to wrap your head around, but the learning curve is not terrible. It’s an efficient system that aims to cut out unnecessary logic and expressions and focus solely on the job of security. That being said, the rule language is rich and extensible. It is quite possible to make use of external scripts (such as PHP) to do specific security tasks. Lua support is also extremely useful. According to the author, the rule system will cover about 80% of the needs for most tasks. The last 20% or so, where you need a ‘real’ programming language, is covered by Lua and its tight integration with ModSecurity. Now, as a disclaimer, ModSecurity is not an excuse to be a lazy programmer. You still need to use good, secure programming practices to make your clients’ applications secure as well as useful.

Now, the book. Why is this book so important? It is THE source for ModSecurity if you care at all about the application. This book covers everything from download and install to configuration and creating your own rule sets. Additionally, this book was written by the author who created ModSecurity, Ivan Ristic. The book reads like your best programmer friend sitting right next to you, guiding you step by step. I am going to be extremely honest with you though: ModSecurity isn’t the easiest thing in the world to implement at first glance, but the rewards are well worth it. This book teaches you step by step how to reap those rewards and build a reasonably secure system for your clients. Seeing the steps for blocking basic attacks, such as XSS and brute force attacks, was intriguing and educational. It made me think about how I could implement these same techniques in my own programming. Additionally, the comprehensive reference manual was a great touch and a welcome addition. A lot of books just give tutorials, but sometimes a simple paragraph or bullet point is needed to explain a component.

The book itself takes some time to digest. I am convinced that this book needs more than one read to get all the benefits from it. That being said, the additional reads will make you a better programmer and put you ahead of the pack. Feisty Duck publishes a hardcopy of the book and a digital version.


About the author—I am owner and lead software engineer of Medina Labs. I have several years of programming experience as well as the academics behind it.
 
 
 

Never Use $_GET Again

Posted by Matt Butcher on July 8, 2010
IN Development
 

You don’t need to use $_GET or $_POST anymore. In fact, you probably shouldn’t use $_GET and $_POST anymore. Since PHP 5.2, there is a new and better way to safely retrieve user-submitted data.

How many times have we heard about security issues in PHP applications stemming from unescaped GET and POST parameters? Proper escaping of input is a perennial problem with web development in general, and for whatever reason PHP seems to have had more than its fair share of bad publicity on this front.

On the database side, many worries over SQL injection have been squelched. The clever developers of PDO, for example, have constructed a library that analyzes data and escapes it appropriately. But the problem of validating and sanitizing input is still a substantial issue. To my surprise, many seasoned PHP developers still spend precious development cycles building custom code to filter input.

Why is this surprising? Because PHP (from 5.2 onward) has a built-in filtering system that makes the tasks of validating and sanitizing data trivially easy. Rather than accessing the $_GET and $_POST superglobals directly, you can make use of PHP functions like filter_input() and filter_input_array(). Let’s take a quick look at an example:

<?php
$my_string = filter_input(INPUT_GET, 'my_string', FILTER_SANITIZE_STRING);
?>

The code above is roughly the equivalent of retrieving $_GET['my_string'] and then running it through some sort of filter that strips HTML and other undesirable characters. This represents data sanitization, one of the two things that the filtering system can do. These are the two tasks of the filtering system:

  • Validation: Making sure the supplied data complies with specific expectations. In this mode, the filtering system returns the data (converted to the appropriate type) when it matches some criterion, and false when it does not.
  • Sanitizing: Removing unwanted data from the input and performing any necessary type coercion. In this mode the filtering system returns the sanitized data.
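The two modes can be sketched side by side as follows. To keep the example runnable outside a web request, it uses filter_var(), which applies the same filters to an ordinary variable instead of to request input; the variable names and sample values are made up for illustration.

```php
<?php
// Validation: returns the converted value on success, false on failure.
$validAge   = filter_var('42', FILTER_VALIDATE_INT);
$invalidAge = filter_var('forty-two', FILTER_VALIDATE_INT);

// Validation with options: the value must fall inside a range.
$outOfRange = filter_var('150', FILTER_VALIDATE_INT, array(
    'options' => array('min_range' => 1, 'max_range' => 120),
));

// Sanitization: strips everything except digits and the + and - signs,
// and returns the cleaned string.
$digitsOnly = filter_var('4a2', FILTER_SANITIZE_NUMBER_INT);

var_dump($validAge, $invalidAge, $outOfRange, $digitsOnly);
```

Comparing validation results with === false avoids mistaking a legitimate value such as 0 for a failure.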

By default, the filter system provides a menagerie of filters ranging from validation and sanitization of basic types (booleans, integers, floats, etc.) to more advanced filters which allow regular expressions or even custom callbacks.
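As a rough sketch of those more advanced filters, the example below validates a string against a regular expression and passes another through a callback. It again uses filter_var() so it can run without a request; the pattern and sample values are assumptions chosen for illustration, and the built-in trim() stands in for a custom callback.

```php
<?php
// Regular-expression validation: returns the input if it matches, else false.
$pattern  = array('options' => array('regexp' => '/^[a-z]+-[0-9]+$/'));
$goodSlug = filter_var('post-123', FILTER_VALIDATE_REGEXP, $pattern);
$badSlug  = filter_var('Post 123', FILTER_VALIDATE_REGEXP, $pattern);

// Callback filter: the value is passed through any callable.
// Here the built-in trim() stands in for a custom function.
$trimmed = filter_var("  hello \n", FILTER_CALLBACK, array('options' => 'trim'));
```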

The utility of this library should be obvious. Gone are the days of rolling our own input checking tools. We can use a standard (and better performing) built-in system.

But I would take things one step further than merely presenting this as an option. I would go so far as to say that we should no longer directly access superglobals containing user input. There is simply no reason why we should. And the plethora of security issues related to failure to filter input provides more than sufficient justification for my claim. Always use the filtering system. Make it mandatory.
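One way to apply this to a whole request at once is filter_input_array(), which takes a map of expected keys to filters. The sketch below uses its sibling filter_var_array() on a made-up array so that it runs outside a web server; $fakeQuery and the key names are hypothetical stand-ins for real request data.

```php
<?php
// Stand-in for $_GET; in a real request you would call
// filter_input_array(INPUT_GET, $spec) instead.
$fakeQuery = array('page' => '3', 'email' => 'not-an-email', 'debug' => 'yes');

// One filter per expected key.
$spec = array(
    'page'  => FILTER_VALIDATE_INT,
    'email' => FILTER_VALIDATE_EMAIL,
    'limit' => FILTER_VALIDATE_INT, // not present in the input
);

// Keys that are not in $spec (such as "debug") are left out of the result;
// keys in $spec that are missing from the input come back as null.
$clean = filter_var_array($fakeQuery, $spec);
```

Declaring the expected inputs in one place like this also documents what a script accepts.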

“But,” one might object, “what if I don’t want my data filtered?” The filtering system provides a null filter (FILTER_UNSAFE_RAW). In cases where the data needn’t be filtered (and these cases are rare), one ought to use something like this:

<?php
// The first argument must be an INPUT_* constant such as INPUT_GET.
$unfiltered_data = filter_input(INPUT_GET, 'unfiltered_data', FILTER_UNSAFE_RAW);
?>

I don’t suggest this out of madness or fanaticism. Following this pattern provides a boon: I can very quickly discover all of the unfiltered variables in my code by running a simple find operation looking for the FILTER_UNSAFE_RAW constant. This is much easier than hunting through every use of $_GET to find those that are not correctly validated or sanitized. Risky treatment of input can be managed more efficiently by following this pattern.

Filters won’t solve every security-related problem, but they are a tremendous step in the right direction when it comes to writing safe (and performant) code. It’s also simpler. Sure, the function call is longer, but it relieves developers of the need to write their own filtering systems. These are darn good reasons to never use $_GET (or $_POST and the others) again.


About the author—Matt Butcher is a Senior Developer in the About.com division of the New York Times Company. He has written six books and numerous articles. Matt is the maintainer of the QueryPath PHP library and active in the Drupal community. Along with blogging for php|architect, he regularly posts his thoughts at TechnoSophos. He lives with his wife and daughters in Chicago.