Hi guys,

I know that include, isset, require, print, echo, and some others are not functions but language constructs.

Some of these language constructs need parentheses, others don't.

require 'file.php';
isset($x);

Some have a return value, others do not.

print 'foo'; // returns 1
echo  'foo'; // no return value
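
A minimal sketch of the two behaviors above, plus the fact that echo accepts several comma-separated arguments. The variable name $result is only illustrative.

```php
<?php
// print behaves like an expression and always evaluates to 1,
// so its value can be assigned.
$result = print "foo\n";
var_dump($result); // int(1)

// echo is a statement, not an expression, so it has no value to assign.
// Uncommenting the next line produces a parse error:
// $result = echo "foo\n";

// echo accepts several comma-separated arguments; print takes exactly one.
echo 'foo', ' ', 'bar', "\n";
```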

So what is the internal difference between a language construct and a built-in function?

Thanks for any advice! :-)

+4  A: 

After wading through the code, I've found that PHP parses some of these statements in a yacc file, so they are special cases.

(see Zend/zend_language_parser.y)

Apart from that I don't think that there are other differences.

terminus
A: 

You can override built-in functions. Keywords are forever.

Jason S
That's not a built-in function. It is defined in the APD (Advanced PHP Debugger) extension.
Ionuț G. Stan
About overriding functions: you could have a look at the runkit extension. It's not core either, it's an extension, so this doesn't answer the OP, only this answer. It's really powerful and more recent than APD (and I believe I heard some time ago that some people were still working on it, even if that's not shown on pecl.php.net).
Pascal MARTIN
+7  A: 

Language constructs are provided by the language itself (like the instructions "if", "while", ...); hence their name.

One consequence of that is that they are faster to call than functions (or so I've heard/read several times).

I do not know how it's done, but one thing they can do (because they are integrated directly into the language) is "bypass" some kind of error-handling mechanism. For instance, isset can be used with non-existent variables without causing a Notice:

function test($param) {}
if (test($a)) {
    // Notice: Undefined variable: a
}

if (isset($b)) {
    // No notice
}

Note that this is not the case for all language constructs.

Another difference between functions and language constructs is that some constructs can be used without parentheses, like a keyword.

For instance:

echo 'test'; // language construct => OK

function my_function($param) {}
my_function 'test'; // function => Parse error: syntax error, unexpected T_CONSTANT_ENCAPSED_STRING

Here too, it's not the case for all language constructs.

I suppose there is absolutely no way to "disable" a language construct: it is part of the language itself. On the other hand, lots of "builtin" PHP functions are not really builtin and are provided by extensions, some of which are always activated (but not all of them).

Another difference is that language constructs can't be used as "function pointers" (I mean callbacks, for instance):

$a = array(10, 20);

function test($param) {echo $param . '<br />';}
array_map('test', $a);  // OK (function)

array_map('echo', $a);  // Warning: array_map() expects parameter 1 to be a valid callback, function 'echo' not found or invalid function name
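
If echo-like behavior is needed in a callback, one common workaround is to wrap the construct in a function. This is only a sketch; the name $echoLine is hypothetical and not part of the original answer.

```php
<?php
$a = array(10, 20);

// Hypothetical wrapper: an anonymous function gives echo a callable form.
$echoLine = function ($value) {
    echo $value . "\n";
    return $value;
};

// array_map() accepts the closure because it is a real callable.
$result = array_map($echoLine, $a);

// The name of a language construct is not a valid callable,
// whereas the name of a real function is.
var_dump(is_callable('echo'));   // bool(false)
var_dump(is_callable('strlen')); // bool(true)
```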

I don't have any other ideas coming to mind right now... and I don't know much about the internals of PHP... So that'll be it for now ^^

If you don't get many answers here, maybe you could ask this on the internals mailing list (see http://www.php.net/mailing-lists.php ), where there are many PHP core developers; they are the ones who would probably know about that stuff ^^

(And I'm really interested in the other answers, btw ^^ )

As a reference : list of keywords and language constructs in PHP

Pascal MARTIN
You can have a function that accepts a not-set variable without generating a notice by taking the variable by reference. This is not limited to language constructs like isset().
Tom Haigh
Oh, didn't think about that :-( Thanks!
Pascal MARTIN
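
The by-reference point from the comments above can be sketched as follows. The helper name is_set_and_truthy is hypothetical. One caveat worth knowing: unlike isset, passing by reference creates the variable (or array key) with a null value as a side effect.

```php
<?php
// Hypothetical helper: because the parameter is taken by reference, PHP
// creates the variable (as null) at the call site instead of reading an
// undefined one, so no notice or warning is raised.
function is_set_and_truthy(&$param) {
    return isset($param) && (bool) $param;
}

$r1 = is_set_and_truthy($undefined); // no notice here
var_dump($r1); // bool(false)

// Side effect: the missing array key now exists, holding null.
$arr = array();
is_set_and_truthy($arr['missing']);
var_dump(array_key_exists('missing', $arr)); // bool(true)
```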
+18  A: 

(This is longer than I intended; please bear with me.)

Most languages are made up of something called a "syntax": the language is comprised of several well-defined keywords, and the complete range of expressions that you can construct in that language is built up from that syntax.

For example, let's say you have a simple four-function arithmetic "language" that only takes single-digit integers as input and completely ignores order of operations (I told you it was a simple language). That language could be defined by the syntax:

// The | means "or" and the := represents definition
$expression := $number | $expression $operator $expression
$number := 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
$operator := + | - | * | /

From these three rules, you can build any number of single-digit-input arithmetic expressions. You can then write a parser for this syntax that breaks down any valid input into its component types ($expression, $number, or $operator) and deals with the result. For example, the expression 3 + 4 * 5 can be broken down as follows:

// Parentheses used for ease of explanation; they have no true syntactical meaning
$expression = 3 + 4 * 5
            = $expression $operator (4 * 5) // Expand into $exp $op $exp
            = $number $operator $expression // Rewrite: $exp -> $num
            = $number $operator $expression $operator $expression // Expand again
            = $number $operator $number $operator $number // Rewrite again

Now we have a fully parsed syntax, in our defined language, for the original expression. Once we have this, we can go through and write a parser to find the results of all the combinations of $number $operator $number, and spit out a result when we only have one $number left.

Take note that there are no $expression constructs left in the final parsed version of our original expression. That's because $expression can always be reduced to a combination of other things in our language.
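
To make the toy language concrete, here is a minimal sketch of an evaluator for it. It assumes "ignoring order of operations" means evaluating strictly left to right; the function name evaluate_toy is hypothetical.

```php
<?php
// Hypothetical evaluator for the toy grammar: single-digit numbers and the
// four operators, applied strictly left to right (no precedence).
function evaluate_toy($input) {
    // Drop whitespace so the input is just digits and operators.
    $compact = preg_replace('/\s+/', '', $input);

    // The grammar boils down to: number (operator number)*
    if (!preg_match('/^[0-9](?:[-+*\/][0-9])*$/', $compact)) {
        throw new InvalidArgumentException("Not a valid expression: $input");
    }

    $result = (int) $compact[0];
    for ($i = 1; $i < strlen($compact); $i += 2) {
        $operator = $compact[$i];
        $number = (int) $compact[$i + 1];
        switch ($operator) {
            case '+': $result += $number; break;
            case '-': $result -= $number; break;
            case '*': $result *= $number; break;
            case '/':
                if ($number === 0) {
                    throw new RuntimeException('Division by zero');
                }
                $result /= $number;
                break;
        }
    }
    return $result;
}

var_dump(evaluate_toy('3 + 4 * 5')); // int(35), because (3 + 4) * 5
```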

PHP is much the same: language constructs are recognized as the equivalent of our $number or $operator. They cannot be reduced into other language constructs; instead, they're the base units from which the language is built up. The key difference between functions and language constructs is this: the parser deals directly with language constructs. It simplifies functions into language constructs.

The reason that language constructs may or may not require parentheses and the reason some have return values while others don't depends entirely on the specific technical details of the PHP parser implementation. I'm not that well-versed in how the parser works, so I can't address these questions specifically, but imagine for a second a language that starts with this:

$expression := ($expression) | ...

Effectively, this language is free to take any expressions it finds and get rid of the surrounding parentheses. PHP (and here I'm employing pure guesswork) may employ something similar for its language constructs: print("Hello") might get reduced down to print "Hello" before it's parsed, or vice-versa (language definitions can add parentheses as well as get rid of them).

This is the root of why you can't redefine language constructs like echo or print: they're effectively hardcoded into the parser, whereas functions are mapped to a set of language constructs and the parser allows you to change that mapping at compile- or runtime to substitute your own set of language constructs or expressions.

At the end of the day, the internal difference between constructs and functions is this: language constructs are understood and dealt with by the parser. Built-in functions, while provided by the language, are mapped and simplified to a set of language constructs before parsing.
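
One way to see that constructs are known to the language itself is to look at how PHP tokenizes a line of code. This sketch assumes the tokenizer extension is available (it usually is by default). It only shows the lexer-level distinction: echo gets its own dedicated token, while a function name such as strlen is just a generic identifier.

```php
<?php
// Assumes the tokenizer extension is loaded.
$tokens = token_get_all('<?php echo strlen("x");');

$names = array();
foreach ($tokens as $token) {
    // Multi-character tokens are arrays: [id, text, line].
    // Single characters such as "(" or ";" are plain strings and skipped here.
    if (is_array($token)) {
        $names[] = token_name($token[0]);
    }
}

// echo appears as the dedicated token T_ECHO;
// strlen appears as a generic T_STRING identifier.
print_r($names);
```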

More info:

  • Backus-Naur form, the syntax used to define formal languages (yacc uses this form)

Edit: Reading through some of the other answers, people make good points. Among them:

  • A language builtin is faster to call than a function. This is true, if only marginally, because the PHP interpreter doesn't need to map that function to its language-builtin equivalents before parsing. On a modern machine, though, the difference is fairly negligible.
  • A language builtin bypasses error-checking. This may or may not be true, depending on the PHP internal implementation for each builtin. It is certainly true that more often than not, functions will have more advanced error-checking and other functionality that builtins don't.
  • Language constructs can't be used as function callbacks. This is true, because a construct is not a function. They're separate entities. When you code a builtin, you're not coding a function that takes arguments - the syntax of the builtin is handled directly by the parser, and is recognized as a builtin, rather than a function. (This may be easier to understand if you consider languages with first-class functions: effectively, you can pass functions around as objects. You can't do that with builtins.)
Tim
wow! thanks for this great answer!
Philippe Gerber
A: 

Hey! Thanks, guys. The post is really interesting and I have learned a lot from it.

I know that echo accepts multiple parameters, whereas print doesn't.

I experimented with something like the following:

$val = 100;
echo("Lalith"), "   is a good person $val times", (" than dhaval");
print('
Always it prints me');

It didn't throw any error; rather, it gave the expected output.

What is the behavior of echo? Is it a language construct, a built-in function, or both?

Lalith
If you have a question, you need to ask your own question, not post in someone else's thread.
SilentGhost