I really don’t like programming. I built this tool to program less so that I could just reuse code. - Rasmus Lerdorf
In 1994, Rasmus Lerdorf published the first release of what he called his “Personal Home Page Tools”, which later became known as PHP. As Douglas Adams observed about the creation of the universe, this has made a lot of people very angry and has been widely regarded as a bad move. Programmers around the world have been lamenting the shortcomings of PHP for as long as PHP has existed. An absurd amount of ink has been spilled on the topic of PHP’s design deficiencies, summarized more recently as a “fractal of bad design”.
When it’s all said and done, though, I unapologetically love PHP. I do not love PHP the language; rather, PHP the tool and programming model nail many aspects of modern web development in a way that few systems today do:
- Requests to a PHP service share nothing by default and are completely isolated from one another. There is no global state that outlives a request. The success of languages and systems like Erlang/OTP has shown that there is true power in isolated heaps, coming in the form of bug prevention, fault tolerance, and performance.
- Requests are the unit of concurrency, not threads. Because no state is shared between requests, data-race style bugs are impossible. It is trivial to reason about global variables because there are no “true” globals that persist beyond the scope of a request.
- Requests provide a natural lifetime for all objects that are created during processing. Clever runtimes are itching to use this to their advantage: Go experimented with this in 2016 and .NET has toyed with a similar concept. Neither of these produced concrete performance wins due to Go and .NET’s inability to prevent sharing of data between threads - something PHP already does by construction.
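The share-nothing property in the list above can be sketched in a few lines. This is a toy model in Python, not how the Zend Engine works: `SCRIPT` and `handle_request` are hypothetical names, and a fresh dictionary stands in for the fresh heap a request would get.

```python
# A hypothetical "page script". It tries to keep a counter in what looks
# like global state.
SCRIPT = """
hits = globals().get("hits", 0) + 1
response = f"hits={hits}"
"""


def handle_request(script_source: str) -> str:
    """Run the script in a brand-new namespace, as if in a fresh process."""
    namespace: dict = {}  # this request's private "heap"
    exec(compile(script_source, "<script>", "exec"), namespace)
    # The namespace, and every object created in it, becomes garbage as
    # soon as we return: the request is the lifetime of all its objects.
    return namespace["response"]


def handle_request_shared(script_source: str, shared: dict) -> str:
    """Contrast case: a long-lived namespace leaks state across requests."""
    exec(compile(script_source, "<script>", "exec"), shared)
    return shared["response"]
```

Calling `handle_request(SCRIPT)` twice yields `hits=1` both times, because nothing survives the first call. The shared variant returns `hits=1` and then `hits=2`, which is exactly the cross-request state that the PHP model rules out.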
The fact that PHP has been as tremendously successful as it has, despite the significant shortcomings of the language, implies that PHP the tool has touched on a few key truths that keep its users coming back. As a lover of PHP the tool, I find this very frustrating. How can it be that we’ve gone this long knowing that PHP got some things very right without thinking hard about dropping the language that has held it back?1 I’ve always loved programming languages and systems, so I sometimes daydream about what a successor to PHP might look like; this blog post tries to explore this design space a bit.
PHP: Compilation vs. Interpretation
Like many languages, PHP has experienced multiple implementations throughout its lifetime. The main implementation of PHP is the Zend Engine, which is an interpreter. Through interpretation, PHP enjoys (regrets?) highly-dynamic features such as require and include.2
Other implementations have chosen to compile PHP. Facebook famously compiled their PHP to C++ prior to writing HHVM. PeachPie compiles PHP to .NET. Facebook ended up dropping ahead-of-time compilation of their PHP in favor of runtime translation via HHVM. In their own words:
Prior to HHVM, our development environments (we call them “sandboxes”) used a custom-built PHP interpreter called HPHPi to shortcut the long and slow HPHPc compilation cycle and provide a rapid “edit, save, run” development workflow. HPHPi was flexible but slow (slower than the Zend engine that HPHPc replaced).
The implication here was that compiling their PHP ahead-of-time sacrificed one of PHP’s incredible strengths: its edit-save-run cycle. Iterating on PHP is as simple as saving a file and refreshing a page in your browser. HPHPi was slower than Zend, and yet this quality was so important that shipping it to developers was still a win!
The reality of compilation is that there will always be a non-zero amount of work that needs to happen between the time you save your code and the time you run your code. When I think about a language that preserves this great strength of PHP, I just can’t imagine being able to insert a compiler in-between a user’s save and run cycle that is fast enough to feel as instantaneous as PHP does. I also can’t imagine having to re-start a server every time I change a file.
I don’t think there’s a system out there today that has the same edit-save-run cycle speed as PHP:3
- NodeJS ate some of PHP’s lunch in the early 2010s by offering a quick edit-save-run cycle, by virtue of interpreting JavaScript and not compiling it. However, language deficiencies in JavaScript have caused the ecosystem to trend heavily towards compilation (e.g. typescript), even for vanilla JavaScript projects (e.g. webpack, babel). If you’re writing modern JavaScript on the server, there’s probably a compiler or two in your edit-save-run cycle somewhere.
- Python and Ruby usually require server reloads when code changes. Highly-used frameworks such as Rails can do a limited form of hot reloading. Perhaps this is why these two languages enjoy such strong representation in the web world.
- Erlang/OTP is perhaps the closest of any language I’m aware of to the dream of zero-restart edit-save-run cycles. However, Erlang and PHP have different goals, and Erlang’s original use cases barely included dynamic web sites at all. Newer languages on the Erlang runtime, like Elixir, put dynamic web sites closer to the center, but their procedure for hot reloading is not as simple as PHP’s.
- Go and all other compiled languages insert a compiler and linker into the cycle, which (although the Go compiler is quite fast) is still too slow, especially since you’ll probably have to restart your server anyway to run the new compiled binary.
Any system that wants to capture the magic of PHP needs to inherit the execution model of PHP; it needs to look like a new PHP process is executed for every request. It is not a requirement to actually launch a new process upon every request, but the fact that all you have to do to see your edited code in action is refresh your browser is an aspect of PHP that has been integral to its success.
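To make the “looks like a new process per request” idea concrete, here is a minimal sketch in Python. It is an illustration of the model, not of PHP’s implementation: `serve` and `demo` are hypothetical names, and the standard library’s `runpy.run_path` stands in for “execute this file from scratch”.

```python
import runpy
import tempfile
from pathlib import Path


def serve(script_path: Path) -> str:
    """Handle one request by re-running the script file from disk."""
    # runpy.run_path reads the file, executes it in a fresh namespace, and
    # returns that namespace's globals as a dict. Nothing is cached between
    # calls, so the next request always sees the file as it is on disk.
    result = runpy.run_path(str(script_path))
    return result["response"]


def demo() -> tuple:
    """Simulate request, edit, save, refresh, with no server restart."""
    with tempfile.TemporaryDirectory() as tmp:
        page = Path(tmp) / "page.py"
        page.write_text('response = "hello, v1"\n')
        first = serve(page)
        page.write_text('response = "hello, v2"\n')  # "edit" and "save"
        second = serve(page)                          # "refresh"
        return first, second
```

`demo()` returns `("hello, v1", "hello, v2")`: the second request picks up the edit without any reload step. A real runtime would presumably cache compiled code and invalidate it when the file changes, but the observable behavior should stay the same.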
PHP as a Web DSL
In many ways, PHP is a DSL for web applications. PHP is a lousy general-purpose programming language. Any language that aims for the same niche as PHP doesn’t need to be particularly good at general-purpose programming either; there is no shortage of languages that already fill that role.
Languages such as Dark and Ballerina are attempts to provide DSLs for cloud-based service development. Both of them are or were for-profit efforts. However, programming languages have a tendency to cross-pollinate ideas, and it’s clear from the design of these languages that there are interesting language features oriented towards the development of applications within PHP’s niche:
- Dark has types for particularly sensitive forms of input, where a type-checker can reason about the sensitivity of particular values. Pyre, a static type-checker for Python (also by Facebook), tracks the “taintedness” of particular values for security analysis.
- Ballerina has a json type, which allows a typechecker to actually understand what is today a universal data interchange format.
- Facebook introduced XHP as an extension to PHP to reify XML within the language with the goal of producing HTML; the extension was useful enough to end up in Hack as well. ReactJS further reifies XML for the purposes of HTML, this time in JavaScript. Visual Basic has had XML literals for a long time.
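As a rough illustration of the “sensitive input” idea from the list above, here is a sketch in Python. It is not how Dark or Pyre implement it; `Tainted`, `SafeHtml`, `escape`, and `render_greeting` are hypothetical names, and the check is done at runtime only because the sketch has no typechecker of its own.

```python
import html
from dataclasses import dataclass


@dataclass(frozen=True)
class Tainted:
    """A raw value that came from the request and is not yet trusted."""
    raw: str


@dataclass(frozen=True)
class SafeHtml:
    """Text that has been escaped and may be written into a page."""
    text: str


def escape(value: Tainted) -> SafeHtml:
    """The only sanctioned way to turn request input into page content."""
    return SafeHtml(html.escape(value.raw))


def render_greeting(name: SafeHtml) -> str:
    # A static checker such as mypy would reject a Tainted argument from
    # the annotation alone; the isinstance check is a runtime backstop.
    if not isinstance(name, SafeHtml):
        raise TypeError("render_greeting requires SafeHtml")
    return f"<p>Hello, {name.text}!</p>"
```

With this shape, `render_greeting(escape(Tainted("<script>")))` produces `<p>Hello, &lt;script&gt;!</p>`, while passing the `Tainted` value directly is an error. A language built for this niche could make that distinction part of its core type system instead of a library convention.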
PHP’s unique execution model provides a good foundation for dynamic web pages, but it’s clear that there’s a lot more room to grow. Hack has a ton of language features on top of PHP such as the concurrent block, the using block for object disposal, and async/await.
Hack serves as a gradual adoption target for existing PHP code. A language that isn’t encumbered by PHP’s legacy has a lot more room for R&D in this space.
PHP as a Typed Language
The meteoric rise of TypeScript, as well as sibling typecheckers/languages Sorbet (Ruby), mypy (Python), Hack (PHP), and Raku, has shown empirically that type systems provide an enormous benefit to programmers when writing programs in the large. Typechecking is particularly useful in programs exposed over the internet since type errors can quickly turn into operational and security headaches. Many of the features mentioned in the above section require a typechecker to reach their full potential (typing json, taint analysis, etc.).
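To give a feel for what “typing json” could mean, here is a small sketch in Python (3.9 or later). The `Json` alias and the `is_json` helper are hypothetical names for illustration and say nothing about how Ballerina models its json type.

```python
from typing import Union

# Recursive alias: a JSON value is a scalar, a list of JSON values, or a
# string-keyed dict of JSON values.
Json = Union[None, bool, int, float, str, list["Json"], dict[str, "Json"]]


def is_json(value: object) -> bool:
    """Runtime counterpart of the alias: does this value fit the shape?"""
    if value is None or isinstance(value, (bool, int, float, str)):
        return True
    if isinstance(value, list):
        return all(is_json(item) for item in value)
    if isinstance(value, dict):
        return all(
            isinstance(key, str) and is_json(item)
            for key, item in value.items()
        )
    return False
```

Here `is_json({"a": [1, 2.5, None, {"b": "c"}]})` is true, while a value containing a set or a non-string key is rejected. With a typechecker in the loop, the same constraint could be enforced before the program runs at all.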
If we’re thinking about the future of PHP, we need to think about a solid type system and typechecker - one that is efficient enough to run as part of the edit-save-run cycle, while powerful enough to typecheck the language in its entirety.
Summary
I’m not sure what the future of languages in PHP’s niche looks like, but I feel excited by the possibilities of language R&D in this space. PHP is a terrible language and an outstanding tool at the same time. I want something that is both an outstanding language and an outstanding tool!
1. Facebook did do this, first with their custom PHP runtime HHVM and, later, their language Hack. In many ways Facebook has already laid out a significant body of research on the evolution of PHP. However, in doing so, they left their open-source users behind. Facebook realized, as this blog post also argues, that PHP the language was holding them back and needed to change. It is unfortunate (though not unexpected given the nature of corporate OSS) that there is exactly one program written in Hack that Facebook cares about: the Facebook website. It is worth noting that Slack also uses Hack and HHVM.
2. include is the same as require except that require issues a fatal error on failure, while include only issues a warning. include is an excellent example of a tremendous unforced error in language design that plagues PHP to this day. No modern language would continue to execute if the import of another source file failed.
3. I think that Dart has gotten pretty close to this, but I don’t have any experience with Dart and it seems like its primary role is to be the execution substrate of Flutter.