How PHP Really Reads Your Code (Abstract Syntax Tree (AST))
Do you ever wonder how PHP parses your code? Wonder no longer because in this video, Scott Keck-Warren from the PHP Architect Channel shows you how to inspect your code using the AST with the PHP-Parser library.
How PHP Really Reads Your Code (Using ASTs)
When we pass our source code to the php executable, it doesn’t just get executed as is. It’s converted from text to an Abstract Syntax Tree (AST), then to opcodes, and only then is it run. One of these steps, the AST, is something other tools can build on to inspect and manipulate our code at scale.
In this video/article, we’ll discuss how we can inspect our code using the AST with the PHP-Parser library.
Hello developers and welcome to the PHP Architect Channel! If this is your first time here my name is Scott Keck-Warren and on this channel, we discuss a wide variety of topics related to the PHP ecosystem. Make sure you subscribe to get our latest videos when they’re published.
What Is an Abstract Syntax Tree (AST)?
Imagine PHP is trying to understand your code like a human would read a sentence. If you write:
echo "Hello, world!";
PHP doesn’t just run that string of text as-is. It first parses the code, breaking it into smaller parts so it can understand the structure. It identifies keywords like echo, strings like "Hello, world!", and punctuation like ; and converts those pieces into a structured, tree-like representation of your code. That’s the Abstract Syntax Tree. It’s “abstract” because it doesn’t care about “irrelevant” details like whitespace or comments and instead focuses only on the meaningful structure of your code.
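To make the "smaller parts" concrete, here is a minimal sketch that lists the pieces PHP's lexer finds in our one-line script. It uses the built-in token_get_all() and token_name() functions from the tokenizer extension (enabled by default in most PHP builds, which is an assumption about your setup). The $summary variable and the CHAR label are just names chosen for this illustration.

```php
<?php
// List the lexer tokens for a one-line script using PHP's built-in tokenizer.
$source = '<?php echo "Hello, world!";';

$summary = [];
foreach (token_get_all($source) as $token) {
    if (is_array($token)) {
        // Named tokens arrive as [id, text, line].
        [$id, $text] = $token;
        $summary[] = token_name($id) . ' ' . json_encode($text);
    } else {
        // Single-character tokens such as ";" arrive as plain strings.
        $summary[] = 'CHAR ' . json_encode($token);
    }
}

echo implode(PHP_EOL, $summary), PHP_EOL;
```

Notice that the token list still contains a whitespace token between echo and the string. That is exactly the kind of detail the AST leaves out.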
Here’s a visual of what the AST for our echo statement looks like:
array(
    0: Stmt_Echo(
        exprs: array(
            0: Scalar_String(
                value: Hello, world!
            )
        )
    )
)
At the top level, it’s a Stmt_Echo node (an “echo statement”), and it has a Scalar_String node as its “input”, which in this case is called exprs, short for expressions.
Enter nikic/PHP-Parser
It’s a challenge to access the AST of our PHP code directly, so it would be helpful if we could access it using the PHP we’re already used to. We can do just that using nikic/PHP-Parser, an amazing library created by Nikita Popov. It lets you parse PHP code into an AST you can work with in your own PHP code.
While the full library is great for writing custom linters and static analysis tools, today we’re just going to focus on the CLI tool it provides, called php-parse. php-parse outputs a text representation of the AST for any PHP code we give it, which makes it useful for inspecting how that code is parsed.
Installing php-parse
First, make sure you have Composer installed. Then install nikic/php-parser globally or, as I’m going to do, in a test project.
mkdir php-parser
cd php-parser
composer require nikic/php-parser
Once installed, the CLI tool will be available at vendor/bin/php-parse.
Seeing Your First AST
Now let’s see what happens when we parse a file using php-parse. Start by creating a test file called hello.php with the following contents:
<?php
echo "Hello, world!";
If it looks familiar it’s because it’s the example we used before.
Now you can run:
./vendor/bin/php-parse hello.php
You’ll get output that looks something like this:
% ./vendor/bin/php-parse hello.php
====> File hello.php:
==> Node dump:
array(
    0: Stmt_Echo(
        exprs: array(
            0: Scalar_String(
                value: Hello, world!
            )
        )
    )
)
This is a simple example, but the structure scales to complex code too.
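For comparison, the same kind of dump can be produced from a script instead of the CLI. This is a minimal sketch that assumes PHP-Parser 5.x was installed with Composer as shown earlier and that the script sits in the project root next to the vendor directory (the script's location is an assumption for the example, not something the library requires).

```php
<?php
// Assumes nikic/php-parser 5.x installed via Composer in this directory.
require __DIR__ . '/vendor/autoload.php';

use PhpParser\Error;
use PhpParser\NodeDumper;
use PhpParser\ParserFactory;

$code = '<?php echo "Hello, world!";';

// Build a parser that targets the newest PHP version the library supports.
$parser = (new ParserFactory())->createForNewestSupportedVersion();

try {
    // parse() returns an array of statement nodes.
    $ast = $parser->parse($code);
} catch (Error $error) {
    echo "Parse error: {$error->getMessage()}\n";
    exit(1);
}

// NodeDumper produces the same style of text dump that php-parse prints.
echo (new NodeDumper())->dump($ast), "\n";
```

Working from a script is handy once you want to do something with the nodes beyond reading them, since $ast is an ordinary array of objects.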
A More Complex Example
Let’s say you have this example code of a function that builds an interpolated string. Save it as greet.php:
<?php
function greet(string $name): string {
    return "Hello, $name!";
}
echo greet("Alice");
Now run:
vendor/bin/php-parse greet.php
You’ll see output like:
====> File greet.php:
==> Node dump:
array(
    0: Stmt_Function(
        attrGroups: array(
        )
        byRef: false
        name: Identifier(
            name: greet
        )
        params: array(
            0: Param(
                attrGroups: array(
                )
                flags: 0
                type: Identifier(
                    name: string
                )
                byRef: false
                variadic: false
                var: Expr_Variable(
                    name: name
                )
                default: null
                hooks: array(
                )
            )
        )
        returnType: Identifier(
            name: string
        )
        stmts: array(
            0: Stmt_Return(
                expr: Scalar_InterpolatedString(
                    parts: array(
                        0: InterpolatedStringPart(
                            value: Hello, 
                        )
                        1: Expr_Variable(
                            name: name
                        )
                        2: InterpolatedStringPart(
                            value: !
                        )
                    )
                )
            )
        )
    )
    1: Stmt_Echo(
        exprs: array(
            0: Expr_FuncCall(
                name: Name(
                    name: greet
                )
                args: array(
                    0: Arg(
                        name: null
                        value: Scalar_String(
                            value: Alice
                        )
                        byRef: false
                        unpack: false
                    )
                )
            )
        )
    )
)
There are a couple of pieces I want to highlight in this output.
The first is how just 4 lines of code have ballooned to almost 70 lines of output, so it’s not an ideal way to view our code. I would highly recommend you use php-parse with a small example and not a 1000+ line file or you will be overwhelmed. I try to reduce my input to a minimal amount of code so I can quickly pick out the information I need.
The second is that our greet() function got converted to a Stmt_Function with a bunch of properties, including its return type, the parameters that should be passed to it, and a series of statements (called stmts). Our simple function just has a single statement, which is to return (Stmt_Return) a string (Scalar_InterpolatedString), but more complex functions will have more statements.
This is important to note because each statement is its own subtree, so this is where the tree can expand to a truly intense number of nodes.
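When the dump gets too long to read, one option is to ask the library for just the nodes you care about. The sketch below assumes PHP-Parser 5.x installed via Composer and uses its NodeFinder class to pull out function declarations from a greet() example similar to the one above. The summary format in printf() is my own choice for illustration.

```php
<?php
// Assumes nikic/php-parser 5.x installed via Composer in this directory.
require __DIR__ . '/vendor/autoload.php';

use PhpParser\Node\Stmt\Function_;
use PhpParser\NodeFinder;
use PhpParser\ParserFactory;

// Nowdoc so that $name is kept literally instead of being interpolated here.
$code = <<<'PHP'
<?php
function greet(string $name): string {
    return "Hello, $name!";
}
echo greet("Alice");
PHP;

$ast = (new ParserFactory())->createForNewestSupportedVersion()->parse($code);

// Collect every Stmt_Function node anywhere in the tree.
$functions = (new NodeFinder())->findInstanceOf($ast, Function_::class);

foreach ($functions as $function) {
    printf(
        "%s(): %d param(s), %d statement(s), starts on line %d\n",
        $function->name->toString(),
        count($function->params),
        count($function->stmts),
        $function->getStartLine()
    );
}
```

The params and stmts properties read here are the same ones that appear in the php-parse dump, which is why inspecting the dump first makes this kind of script much easier to write.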
Why Should You Care?
Okay, so this is cool and all—but why should a “normal” PHP developer care about ASTs?
A few reasons:
- Better understanding of PHP internals – Knowing how your code is parsed helps you better troubleshoot your code because you can understand the order of operations.
- Learning opportunity – Peeking into the AST can be an eye-opener if you’re curious about how languages work.
- Static analysis and tooling – Tools like PHPStan, Psalm, and Rector use ASTs to analyze and transform code. You can extend these tools with a basic understanding of the AST, especially if you subscribe and watch our future videos.
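To give a feel for the static analysis point, here is a small sketch of a hand-rolled check built on PHP-Parser 5.x (assumed to be installed via Composer). It is not how PHPStan, Psalm, or Rector are implemented or extended; it only shows the general visitor pattern of walking the tree and reacting to a node type. The EchoReporter class and its rule are hypothetical, invented for this example.

```php
<?php
// Assumes nikic/php-parser 5.x installed via Composer in this directory.
require __DIR__ . '/vendor/autoload.php';

use PhpParser\Node;
use PhpParser\Node\Stmt\Echo_;
use PhpParser\NodeTraverser;
use PhpParser\NodeVisitorAbstract;
use PhpParser\ParserFactory;

// Hypothetical rule for illustration: record the line of every echo statement.
final class EchoReporter extends NodeVisitorAbstract
{
    /** @var int[] */
    public array $lines = [];

    public function enterNode(Node $node)
    {
        if ($node instanceof Echo_) {
            $this->lines[] = $node->getStartLine();
        }

        return null; // Returning null leaves the node unchanged.
    }
}

$code = "<?php\necho 'one';\nif (true) {\n    echo 'two';\n}\n";
$ast = (new ParserFactory())->createForNewestSupportedVersion()->parse($code);

$reporter = new EchoReporter();
$traverser = new NodeTraverser();
$traverser->addVisitor($reporter);
$traverser->traverse($ast); // Visits nested nodes too, including the echo inside the if.

foreach ($reporter->lines as $line) {
    echo "echo statement on line {$line}\n";
}
```

Because the traverser descends into nested statements, the echo inside the if block is found without any extra code, which would be awkward to get right with plain text search.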
What You Need to Know
- The abstract syntax tree (AST) is a representation of our code after it’s been parsed
- We can use php-parse to inspect the AST of our code
- We can use the results from php-parse to help us extend PHPStan and Rector
Outro
I hope you enjoyed our video.
If so, make sure you subscribe, comment, share, and like, as it does help others find us.
Are there topics you would like to see us cover? Let us know in the comments below or send me a message on any of the social media platforms found at my website, Scott.KeckWarren.com. We would love to hear how we can help you, and it always brightens my day when I hear from a fan.
This is Scott Keck-Warren for the PHP Architect Channel signing off and reminding you to keep watching, keep coding, and keep reading.