A Story of a PHP Variable Reference Pitfall
I am one of the lucky people who stumbled upon PHP many, many years ago, when it was still a little-known, home-grown scripting language. I immediately fell in love with it. There was something pure and wild about it that just did not exist in many other languages, kind of like driving a manual car versus an automatic. The love was permanent, and I’ve always tried to keep up with PHP, even through my dark days of being a .NET, Tcl, Java, etc. programmer at my day jobs. Then, a few years ago, I got lucky enough to make Drupal my day job, and finally I was back to enjoying PHP in full.
Having had an extensive PHP background, I thought that I had it pretty much nailed and little could surprise me. Certainly not something as “simple” as variable references, I thought. I was wrong.
As we all know, non-object variables in PHP 5 are assigned by value. So if $a and $b are arrays and you do $b = $a, $b gets a copy of $a. Well, apparently that is not entirely true. What I discovered the hard way is that even though $b = $a puts a copy of $a in $b, if $a is an array that holds an object somewhere down its structure, $b gets a reference to that same object, not a copy of it! Meaning, $b itself is a copy of $a, but the object somewhere down the structure is shared, so a change made to it through one array shows up in the other.
To clarify the matter, consider this test code:
```php
$a = array();
$a[0] = 'initial string';
$obj = new stdClass();
$obj->title = 'initial title';
$a[1] = $obj;

$b = $a;
$b[0] = 'modified string';
$b[1]->title = 'modified string';

print_r($a);
exit();
```
You would expect (at least, I did) that the result would be:
```
Array
(
    [0] => initial string
    [1] => stdClass Object
        (
            [title] => initial title
        )

)
```
since $b holds a copy of $a and cannot possibly modify values in $a. That is true for any variable except one that holds an object. The actual result you will get is:
```
Array
(
    [0] => initial string
    [1] => stdClass Object
        (
            [title] => modified string
        )

)
```
Notice how the string value did not change, but the object's title did.
Unfortunately, all of this was not quite as clear in my case as it is in the test code. After banging my head against this mystery for a while, I found the explanation in Mike Lively’s blog post:
“Reason for this is that any given variable holding an object in php does not technically hold an object. It holds a handle to that object”
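One way to see the handle behavior, and one possible way around it, is to clone the objects while copying the array. The following is only a minimal sketch: copy_with_cloned_objects() is a hypothetical helper written for this illustration (not part of PHP or Drupal), and it only handles objects at the top level of the array.

```php
<?php
// Hypothetical helper (not from the original post): copies an array and
// clones any objects found at its top level, so the copy stops sharing them.
function copy_with_cloned_objects(array $source) {
    $copy = array();
    foreach ($source as $key => $value) {
        // clone makes a shallow copy of the object itself; objects nested
        // inside it would still be shared.
        $copy[$key] = is_object($value) ? clone $value : $value;
    }
    return $copy;
}

$obj = new stdClass();
$obj->title = 'initial title';
$a = array('initial string', $obj);

$b = $a;                            // array is copied, object handle is shared
$c = copy_with_cloned_objects($a);  // array is copied, object is cloned

$c[1]->title = 'changed through $c';
$b[1]->title = 'changed through $b';

var_dump($a[1] === $b[1]); // bool(true): both arrays point at the same object
var_dump($a[1] === $c[1]); // bool(false): $c holds its own object
echo $a[1]->title, "\n";   // changed through $b
echo $c[1]->title, "\n";   // changed through $c
```

Because clone is shallow, a structure with objects nested inside other objects or sub-arrays would need a recursive version of this helper.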
To make things even worse, the above rule (in PHP 5.2, to be more precise) apparently crosses many boundaries that you would not expect it to cross. One example: if a function returns the value of a static variable from its own scope, and that variable holds an object, the variable itself is returned by value, but any object it holds is returned by reference! All kinds of nasty things can come out of this little-known pitfall. It is especially dangerous for a framework like Drupal, where complex variable structures are commonly held in static variables inside functions to improve performance.
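The static-variable case can be sketched as follows. This is an illustration only: get_settings() and its data are made up for the example and are not taken from Drupal or from the code that originally triggered the problem.

```php
<?php
// Hypothetical cache function, loosely modeled on the static-variable
// pattern described above; the name and data are invented for illustration.
function get_settings() {
    static $cache = null;
    if ($cache === null) {
        $item = new stdClass();
        $item->title = 'initial title';
        $cache = array('label' => 'initial string', 'item' => $item);
    }
    // The array is returned by value, but the object handle inside it
    // still points at the object held in the static cache.
    return $cache;
}

$first = get_settings();
$first['label'] = 'modified string';      // only changes the caller's copy
$first['item']->title = 'modified title'; // changes the object in the cache

$second = get_settings();
echo $second['label'], "\n";       // initial string
echo $second['item']->title, "\n"; // modified title
```

A caller that only meant to adjust its own copy has silently changed what every later caller of the function receives.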
Lesson learned: PHP is, indeed, wild, but not always in a good way :)




Comments
Object references
PHP is by no means the only language to have this issue. All of Java, C#, Python and Perl exhibit this behavior as well. It's why you'll often see languages such as these having "deep copy" functions in addition to "shallow copy" operators.
True, but in Java a = b is an
True, but in Java a = b is an assignment by reference to begin with, so everything is assigned by reference universally. I guess the really annoying and frustrating part, in my case, was that the variables themselves got assigned by value, half of the elements got assigned by value, and only the objects got assigned by reference.
Also, it was not quite as simple an example as the one in the blog post.
That said, you have a good point. Thank you.
Yeah, Ken is spot on -- this
Yeah, Ken is spot on -- this is a "problem" in most languages with any sort of pointers/handles/refs. The reason it's a little more confusing in PHP is a combination of PHP's dynamic weak typing, which causes PHP programmers to be less attentive to the behavior of different types, and on-the-fly object definitions, which similarly cause PHP programmers to overlook the distinct characteristics of objects.
Thanks for your honesty
Irakli,
Thank you for your honesty about this point of PHP. While to me it makes sense (I began my programming career in PHP years ago) I can see where those coming from backgrounds with different languages could easily lose a lot of time trying to understand what is going on with their initial array.
variable scope in drupal
hi Irakli,
your opening statement came up in my search and i was hopeful that it will end my long search for any info on variable scope under drupal.
it seems that any variable i use in my function that is declared as global (e.g. global $a;) is not recognized.
i am building a drupal module, and though my php function works correctly stand alone, when i import in drupal, i lose all the value of global variables.
do you know if there is a "proper" way of defining, or using global variables in drupal? or is it avoided altogether and replaced by function calls?
Appreciate it if you can shed some light on this,
regards,
What exactly were you searching for…?
@fawzi - Not to be a wise guy, but Irakli's opening statement was:
"I am one of the lucky people who stumbled upon PHP when it was still a little-known, home-grown, scripting language, many, many years ago."
What exactly were you searching for?
WRT your question, it sounds like your global declarations come too late for the Drupal request handler to include them in the page load. Try putting them in your menu hook, which is loaded early enough that they should be available.
As for a *proper* way of defining global variables, I'd recommend looking at variable_get() & variable_set() and using those instead. You'll probably save yourself a lot of headaches.
~ Forest Mars
ps @Irakli my captcha word is "performance" which seems like a remarkable coincidence…