A Story of a PHP Variable Reference Pitfall

Posted Aug 27, 2009 // 6 comments
Irakli:

I am one of the lucky people who stumbled upon PHP many, many years ago, when it was still a little-known, home-grown scripting language. I immediately fell in love with it. There was something pure and wild about it that just did not exist in many other languages, kind of like driving a manual versus an automatic car. The love was permanent, and I’ve always tried to keep up with PHP, even through my dark days of being a .NET, TCL, Java, etc. programmer at my day jobs. Then, a few years ago, I got lucky enough to make Drupal my day job, and finally I was back to enjoying PHP in full.

Having had an extensive PHP background, I thought that I had it pretty much nailed and little could surprise me. Certainly not something as “simple” as variable references, I thought. I was wrong.

As we all know, non-object variables in PHP 5 are assigned by value. So if $a is an array and you do $b = $a, $b gets a copy of $a. Well, apparently that is not entirely true. What I discovered the hard way is that even though $b = $a puts a copy of $a in $b, if $a holds an object somewhere down its structure, $b gets a reference to that same object, not a copy of it! In other words, $b itself is a copy of $a, but any object nested inside it is shared with $a, so a change made to that object through either variable shows up in both.

To clarify the matter, consider this test code:

// Build an array holding one string and one object.
$a = array();
$a[0] = 'initial string';
$obj = new stdClass();
$obj->title = 'initial title';
$a[1] = $obj;

// Copy the array, then modify both elements through the copy.
$b = $a;
$b[0] = 'modified string';
$b[1]->title = 'modified string';

// Inspect the original array.
print_r($a);
exit();

You would expect (at least, I did) that the result would be:

Array
(
    [0] => initial string
    [1] => stdClass Object
        (
            [title] => initial title
        )

)

since $b holds a copy of $a and cannot possibly modify values in $a. That is true for any value except an object. The actual result you will get is:

Array
(
    [0] => initial string
    [1] => stdClass Object
        (
            [title] => modified string
        )

)

Notice how the string value did not change, but the object's title did.

Unfortunately, all of this was not quite as clear in my case as it is in the test code. After banging my head against this mystery for a while, I found the explanation in Mike Lively’s blog post:

“Reason for this is that any given variable holding an object in php does not technically hold an object. It holds a handle to that object”
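The handle behavior can be checked directly. The following is a minimal sketch, not taken from Mike Lively's post: it uses the identity operator `===` to confirm that both arrays hold the same object after the assignment, and then uses `clone` to give $b an object of its own. The variable name $sameObject is made up for illustration.

```php
<?php
// Same setup as the test code: an array holding a string and an object.
$a = array('initial string', new stdClass());
$a[1]->title = 'initial title';

// The array is copied, but the copied element is a handle to the same object.
$b = $a;
$sameObject = ($a[1] === $b[1]);   // === on objects is true only for the same instance
var_dump($sameObject);             // bool(true)

// Replace the shared handle in $b with an independent copy of the object.
$b[1] = clone $a[1];
$b[1]->title = 'modified title';

echo $a[1]->title, "\n";           // prints "initial title"
echo $b[1]->title, "\n";           // prints "modified title"
```

Note that `clone` is itself a shallow copy: if the cloned object holds other objects in its properties, those are still shared unless the class defines a `__clone()` method that copies them too.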

To make things even worse, this rule (in PHP 5.2, to be more precise) crosses many boundaries that you would not expect it to cross. One example: if a function returns the value of one of its static variables, and that variable holds an object, the variable itself is returned by value, but any object it holds is returned by reference! All kinds of nasty things can come out of this little-known pitfall. It is especially dangerous for a framework like Drupal, where complex variable structures are commonly held in static variables inside functions to improve performance.
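Here is a rough sketch of the static-variable case. The functions example_settings() and example_settings_copy() are hypothetical and are not Drupal APIs; they only imitate the static-cache pattern described above. The second function shows one possible guard, a deep copy through serialize() and unserialize(), under the assumption that everything in the cache can be serialized.

```php
<?php
// Hypothetical function that keeps a structure in a static variable.
function example_settings() {
    static $cache = NULL;
    if ($cache === NULL) {
        $site = new stdClass();
        $site->name = 'original';
        $cache = array('site' => $site);
    }
    // The array is returned by value; the object handle inside it is shared.
    return $cache;
}

$first = example_settings();
$first['extra'] = 'local only';      // changes only the caller's copy of the array
$first['site']->name = 'changed';    // changes the object held in the static cache

$second = example_settings();
var_dump(isset($second['extra']));   // bool(false)
echo $second['site']->name, "\n";    // prints "changed"

// One possible guard (an assumption, not a Drupal convention):
// hand out a deep copy so callers cannot touch the cached object.
function example_settings_copy() {
    return unserialize(serialize(example_settings()));
}

$third = example_settings_copy();
$third['site']->name = 'changed again';   // affects only the deep copy

$fourth = example_settings();
echo $fourth['site']->name, "\n";    // prints "changed"
```

The deep copy costs a full serialization round trip on every call, which works against the performance reason for caching in a static variable in the first place, so it is a trade-off rather than a free fix.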

Lesson learned: PHP is, indeed, wild, but not always in a good way :)

About Irakli

Irakli is Director of Product Development at Phase2 Technology. His main responsibility is development of packaged, turn-key solutions using open-source technologies and cutting-edge semantic APIs.

Irakli has been an avid open-source ...


Comments

by Ken Whitesell (not verified) on Fri, 08/28/2009 - 07:53

Object references

PHP is by no means the only language to have this issue. All of Java, C#, Python and Perl exhibit this behavior as well. It's why you'll often see languages such as these having "deep copy" functions in addition to "shallow copy" operators.

by irakli on Fri, 08/28/2009 - 08:09

True, but in Java a = b is an assignment by reference to begin with, so everything is assigned by reference universally. I guess the really annoying and frustrating part, in my case, was that the variables themselves got assigned by value, half of the elements got assigned by value, and only the objects got assigned by reference.

Also, my real case was not quite as simple as the example in the blog post.

That said, you have a good point. Thank you.

by Brad (not verified) on Fri, 08/28/2009 - 08:19

Yeah, Ken is spot on -- this is a "problem" in most languages with any sort of pointers/handles/refs. The reason it's a little more confusing in PHP is a combination of PHP's dynamic weak typing, which causes PHP programmers to be less attentive to the behavior of different types, and on-the-fly object definitions, which similarly cause PHP programmers to overlook the distinct characteristics of objects.

by Cody Craven (not verified) on Fri, 08/28/2009 - 23:28

Thanks for your honesty

Irakli,

Thank you for your honesty about this point of PHP. While to me it makes sense (I began my programming career in PHP years ago) I can see where those coming from backgrounds with different languages could easily lose a lot of time trying to understand what is going on with their initial array.

by fawzi (not verified) on Fri, 11/20/2009 - 03:24

variable scope in drupal

hi Irakli,

your opening statement came up in my search and i was hopeful that it will end my long search for any info on variable scope under drupal.

it seems that any variable i use in my function that is declared as global (e.g. global $a;) is not recognized.

i am building a drupal module, and though my php function works correctly stand alone, when i import in drupal, i lose all the value of global variables.

do you know if there is a "proper" way of defining, or using global variables in drupal? or is it avoided altogether and replaced by function calls?

Appreciate it if you can shed some light on this,

regards,

by Forest Mars (not verified) on Sun, 06/20/2010 - 14:42

What exactly were you searching for…?

@fawzi - Not to be a wise guy but Irakli's opening statement was: I am one of the lucky people who stumbled upon PHP when it was still a little-known, home-grown, scripting language, many, many years ago.

What exactly were you searching for?

WRT your question, it sounds like you have your global declarations too late for Drupal's request handler to include them in the page load. Try putting them in your menu hook, which is loaded early enough that they should be available.

As for a *proper* way of defining global variables I'd recommend looking at variable_get() & variable_set() and using those instead. You'll probably save yourself a lot of headache.

~ Forest Mars

ps @Irakli my captcha word is "performance" which seems like a remarkable coincidence…
