7

The Scope of Variables and Handlers

In this chapter, when we speak of variables and handlers, we mean local variables and local handlers. The word local signifies that these variables and handlers are "owned" by the scripts in which they appear, as opposed to database entries, which are global (accessible to any script).1

We have met a local variable in Example 4-1:

s = "Hello, World!"

dialog.notify(s)

When we press this script's Run button, the script starts executing, and in the first line, the variable s is created. In the second line the same variable s is used. The script comes to an end, and s goes out of existence. No other script can access this s unless this script, while running, explicitly calls the other script in such a way as to provide such access (we shall learn in Chapter 8, Addresses, how this is done).

We have met a local handler in Example 5-3:

on sayHi ()

    local (theDay, theMonth, theYear, theHour, theMin, theSec)

    date.get (clock.now(), @theDay, @theMonth, \

        @theYear, @theHour, @theMin, @theSec)

    if theHour >= 12

        dialog.notify ("Good afternoon or evening.")

    else

        dialog.notify ("Good morning.")

sayHi()

When we press this script's Run button, the script starts executing, and in the first line, the verb sayHi() is created. In the second summit-level line the same verb sayHi() is called. The script comes to an end, and sayHi() goes out of existence. No other script can access sayHi() unless this script, while running, explicitly calls the other script in such a way as to provide such access.2

This chapter explains how Frontier distinguishes between references to local variables and references to database entries. It also discusses how and when a local variable or handler comes into existence and goes out of existence, and what areas of a script can "see" a given local variable or handler; such matters are termed the scope of the variable or handler.

local

Variables and database entries share the same namespace. This means that every time Frontier encounters an object reference, it must make a decision as to whether what is being referred to is a variable or a database entry. It is important to understand how Frontier makes this decision; otherwise, unexpected results can occur.

For instance, in Example 4-1, s is taken by Frontier to be a variable, not a database entry. But that's just because we were lucky in our choice of name. As we saw in the previous chapter, had we chosen to call our variable cr instead of s, executing the script would have changed the value of an important UserTalk constant, system.verbs.constants.cr, and scripts referring to this constant henceforth would have broken. There is thus a danger of accidentally harming the database, because variables and database entries share the same namespace.

How can the database be protected from such accidents? One way might be to memorize the name of every entry along the search paths, and avoid using any of those names as variable names; but this is impractical and unnecessary. UserTalk provides a means of specifying that a particular name is to be understood as referring to a local variable and not to a database entry. It is called "declaring a variable", and it is done with the local keyword. For instance, in the following example, cr is taken to be a variable throughout, and the database is untouched.

*Example 7-1* *Local declarations protect the database*
local (cr)
cr = "Hello, World!"
dialog.notify (cr)

What declaring a variable with local really does is to bring the variable into existence. I shall now explain why this makes such a difference.

In the previous chapter, we saw how Frontier resolves an object reference; but I omitted from the description of this process its very first step, and also its very last step.

The first step in the resolution of references is this: before ever trying to resolve a name as a reference to a database entry, Frontier first examines the name to see if a variable by that name exists. If so, the name is resolved as referring to that variable.

Now we understand how Example 7-1 works. In the first line, a local variable cr is brought into existence. In the second and third lines, Frontier begins the process of resolving the name cr by checking whether a local variable cr exists; it does, and so it is the local variable cr, not the database entry cr, that is affected.

The last step in the resolution of references is this: if an object reference is of the x form (a single element), and if it cannot be resolved as a database entry, and if the command was to set the value of the object, Frontier creates the object as a local variable and sets its value. It is as if Frontier declares the variable with local where the script has neglected to do so; and so the action is called an implicit local.

Now we understand how Example 4-1 works. In the first line, a name s is encountered, and Frontier attempts to resolve it. Frontier first looks for s among the existing variables, and fails to find it (no variables exist). Next it tries to resolve it as a partial reference to a database entry; but no s is found along the search paths, and Frontier will not create a database entry from a partial reference of the x form, as we know. So Frontier performs an implicit local, creating a variable s for us; and it sets the value of that variable. When we reach the second line of Example 4-1, s does exist as a variable.

Since implicit locals exist, it is not strictly necessary to declare your variables; but to do so is good programming practice and is strongly recommended, because it helps to protect the database from accidents.

Even when a variable name is declared local, one should still try to avoid having it match a database entry on the search paths - because one might want to refer to that database entry in the same script. Naming a local variable cr makes access to UserTalk's constant cr in the same script more difficult; it can no longer be done by saying cr - you have to say system.verbs.constants.cr, to show that you mean the database entry, not the local variable. Still, such a situation is merely inconvenient, not dangerous.
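To make the inconvenience concrete, here is a minimal sketch in which a local variable cr and the constant of the same name are used side by side. It assumes, as stated above, that the constant lives at system.verbs.constants.cr; the message text is arbitrary.

```usertalk
local (cr) « from here on, the bare name cr means the local variable
cr = "Hello, World!"
« the full path reaches the database entry, not the local variable
dialog.notify (cr + system.verbs.constants.cr + "Second line.")
```

The bare name and the full path refer to two different things here, which is exactly why a less ambiguous variable name would have been kinder to the reader.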

local Syntax

A local declaration can declare one or more names. There are two completely equivalent syntaxes: the name(s) can go into parentheses in the same line as the word local,3 using a comma-delimited list if there is more than one name; or the name(s) can appear as individual lines of a bundle subordinated to the word local. Also, a local declaration may assign initial values to any of the variables. So this syntax:

local (theNumber = 4, theString = "howdy", anotherVar)

is equivalent to this syntax:

local

    theNumber = 4

    theString = "howdy"

    anotherVar

Since lines can be combined using semicolons, you could also write the second version like this:

local

    theNumber = 4; theString = "howdy"; anotherVar

A local variable created without explicit assignment of an initial value is implicitly assigned a value of nil and a type of unknownType. This value can be used in operations because it will be implicitly coerced (as discussed in Chapter 10, Datatypes) in a sensible way: as a string, it equates to the empty string; as a number, to 0; and so on.
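A small sketch of this coercion follows. It relies only on the behavior just described, and the comments state what that description leads us to expect rather than a tested result.

```usertalk
local (anotherVar) « no initial value, so anotherVar starts out as nil
msg (anotherVar + 1) « used as a number, nil should behave like 0
msg ("[" + anotherVar + "]") « used as a string, nil should behave like ""
```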

There is a bug involving initial value assignment within a local declaration if what's on the right side of the equal sign is a non-scalar. For instance, suppose workspace.myOutline is an outline. Then it is bad to say:

local (o = workspace.myOutline) « don't do this

The bug is that the local copy of the outline is never created; instead, o becomes another name for workspace.myOutline. The workaround is to declare the local variable o in one statement, and assign it a value in another.
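The workaround can be sketched as follows, assuming as before that workspace.myOutline is an outline:

```usertalk
local (o) « first statement: bring o into existence
o = workspace.myOutline « second statement: assign, so that o gets its own copy
```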

Another bug is that if you comment out the last line of a local bundle (such as the line containing anotherVar in our second syntax illustration earlier), the script will refuse to compile.

Variable Scope

The scope of a variable is the innermost bundle in which the variable's local declaration appears (look for the word local, not the variable name).4 The following rules then apply:

1. A variable cannot be seen by a command outside its scope.
2. When a variable's scope has finished executing, the variable is destroyed (it "goes out of scope").
3. In case of a name conflict, the innermost scope for that name takes precedence.

Rule 3 is related to the way variable and database entry names are resolved: local scope is more "inner" than the database; the database is global, sitting outside the whole script.

If a variable will be needed in only a limited region of a script, it is common practice to declare the variable in the bundle where it is needed, restricting its scope. Indeed, it is not uncommon, where the program logic has not already required a bundle, to "bundleize" part of a script deliberately to restrict a variable's scope, using the bundle construct introduced in Example 4-4; to help with this, there is a Bundleize command in the Script menu.

Restricting the scope of a variable can help avoid accidental misuse of an already existing variable. For example, here is an impractical but illustrative script.

*Example 7-2* *Playing with scope*
local (n = 1)
bundle
    local (n)
    for n = 1 to 3
        msg (n)
        clock.waitseconds (1)
msg (n)

The msg() verb displays its parameter in Frontier's Main Window.5 The for construct denotes a loop that counts incrementally, so its subordinate bundle will be run three times, using the values 1, 2, and finally 3 for n.

When you run the script, you will see the Main Window count 1, 2, 3, then change back to 1. Why? The script twice declares a variable called n; the scope of the first is the whole script, the scope of the second is the inside of the bundle construct. When we get to the for loop, the second n is innermost in scope, in comparison with the n declared at the start of the program; so it is this n, not the one at the start of the program, that is used in the for loop and displayed in the Main Window. Then, when we get to the last line of the script, we are outside the bundle construct, and the innermost n no longer exists. We are left with only the n created in the first line of the script. This n was initialized to 1 in the first line - and its value was never changed thereafter, because the n in the for loop was a different n! So when we display n now, we see 1.

The technique is a common one. The programmer bundleizes the declaration of the counting variable and the for which uses it. The counting variable has no other purpose in the script; it comes into existence just in time for the for to use it, and goes out of existence when the for is over. Even if a variable by the same name already exists in the script, thanks to the deliberately restricted scope of the counting variable, the other variable of the same name is unharmed.

It is a runtime error to declare a local variable which already exists with identical scope. So, for example:

local (n = 1)

local (n) « error

Handler Scope

Handler scope is analogous to variable scope. The scope of a handler is the innermost bundle in which its on definition appears. One might not intuitively think of a handler as coming into existence and going out of existence, but it does. When an on line is encountered, a local variable is created: it has a name (as declared in the on line) and a value (the handler's code, compiled). This variable is available until its scope finishes executing, at which point it is destroyed.

Indeed, handler scope is the same as variable scope, since handlers and variables share the same subset of the namespace - the local namespace. Just as local variables are checked first before trying to resolve a name as a reference to a database entry, so are local handlers - it's the same namespace.

Since local handlers and local variables share the same namespace, a handler and a variable with the same scope cannot have the same name. So this is a runtime error:

local (theWord)

on theWord() « tries to declare theWord twice

    msg("hi")

theWord()

But the following script is legal, because there is only one theWord, which starts out as a handler and is changed to a number:

on theWord()

    return "Hello"

msg (theWord())

theWord = 6

msg (theWord)

And this is legal, because two different scopes are involved:

local (theWord = "hello")

bundle

    on theWord()

        msg("hi")

    theWord()

    clock.waitseconds(1)

msg (theWord)

The same considerations apply to the choice of handler names as for variable names. It would be foolish, though not harmful, to name a local handler long(), because within this handler's scope an attempt to call the UserTalk verb long() would call the local handler instead; you'd have to specify the UserTalk verb by calling it as system.verbs.globals.long().
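Here is a sketch of that situation. The local handler is hypothetical and does nothing useful; the full path to the UserTalk verb is the one given above.

```usertalk
on long (x) « hypothetical local handler; its name hides the UserTalk verb
    return "local long() was called with " + x
msg (long (7)) « resolved in the local namespace first, so the handler runs
msg (system.verbs.globals.long ("7")) « the full path reaches the UserTalk verb
```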

Uses of Local Handlers

The purpose of a local handler (as opposed to the eponymous handler) is usually to encapsulate a utility operation needed by commands within its scope. This is generally done in the interests of making the script clearer, more elegant, easier to write, and easier to understand.

For example, in Example 5-3, the two wings of our if...else construct are almost identical. We could encapsulate the similarities into a local handler, leaving only the differences, as in the following example.

*Example 7-3* *Abstracting with a local handler*
on sayHi ()
    local (theDay, theMonth, theYear, theHour, theMin, theSec)
    date.get (clock.now(), @theDay, @theMonth, \
        @theYear, @theHour, @theMin, @theSec)
    on sayGood (s) « local handler, encapsulating common functionality
        dialog.notify ("Good " + s + ".")
    if theHour >= 12
        sayGood ("afternoon or evening")
    else
        sayGood ("morning")
sayHi()

There is no gain here in efficiency; indeed, there is a slight efficiency loss, since we must now carry the overhead of an added level of depth. Nonetheless, this style of coding has a certain clarity that is much to the taste of UserTalk programmers.

Consider the local handler obtainLineText() used in Example 4-7:

on obtainLineText()

    if script.isComment()

        return "« " + op.getLineText() + cr

    else

        return op.getLineText() + cr

op.firstsummit()

op.fullexpand()

local (s)

s = obtainLineText()

while op.go (flatdown, 1)

    local (t = op.level ())

    while --t {s = s + "    "}

    s = s + obtainLineText()

clipboard.putvalue (s)

The script might have been written without the local handler, but the code would then have been harder to understand:

op.firstsummit()

op.fullexpand()

local (s)

if script.isComment()

    s = "« " + op.getLineText() + cr

else

    s = op.getLineText() + cr

while op.go (flatdown, 1)

    local (t = op.level ())

    while --t {s = s + "    "}

    if script.isComment()

        s = s + "« " + op.getLineText() + cr

    else

        s = s + op.getLineText() + cr

clipboard.putvalue (s)

The code of Example 4-7 is vastly more legible and logical. True, the simplification involved is very slight: obtainLineText() is very short, and is called in only two places. But the repeated material represents a specialized, meaningful action, and so encapsulating it into a local handler makes the program noticeably easier to understand.

with, again

The order in which a name inside a with is tested for resolution is:

1. The local namespace inside the with's indented bundle
2. The with's domain
3. The local namespace outside the with
4. The database

This order is the only one that makes sense - Frontier is simply proceeding outwards in its search - but a surprising possibility for error lurks here. This was enunciated so clearly in a now famous note on the Internet by Scott Lawton that it is usually referred to as "Lawton's Law." Scott's code went like this:

local (id, s = whatever)

with Eudora

    id = create(s)

The intention here was to set the local variable id to the result of Eudora.create(). Unfortunately, the Eudora table itself contained an entry called id, and this was what was actually changed (according to step 2). This was a very important value which should not have been touched, and all of Scott's commands involving Eudora broke from then on. It took him a long time to track down the problem.

The database would not have been tromped if Scott had written:

local (s = whatever)

with Eudora

    local (id)

    id = create(s)

But since Scott needed the value of id outside the with block, this wouldn't have worked for him: his only solution was to choose another name for id. The danger is always lurking, and is part of the price one pays for the tight integration of the database with the rest of the namespace.
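A sketch of the renaming approach follows. The name newMessageID is hypothetical, and the sketch assumes that the Eudora table contains no entry by that name; whatever remains a placeholder, as before.

```usertalk
local (newMessageID, s = whatever) « a name assumed to be absent from the Eudora table
with Eudora
    « step 2 finds no such entry in the with's domain, so step 3 finds the local
    newMessageID = create(s)
« newMessageID is still in scope here, outside the with
```

The protection is only as good as the assumption: if an entry called newMessageID were ever added to the table, the same accident would happen again.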

Implicit Locals

When Frontier generates an implicit local, it has to decide what scope to give the local variable it is about to create. To make this decision, Frontier works its way upward through all bundles containing the line in which the undeclared variable was encountered, and to each bundle it applies the following two rules, in order:

1. If the bundle contains a local declaration, Frontier declares the local variable at that level.
2. If the bundle is the top level of a handler, Frontier declares the local variable at that level.

The process stops as soon as the local variable is declared; if neither condition is ever met, Frontier declares the local at summit level.

This script doesn't do anything, but it illustrates these points:

on eponymousHandler()

    on localHandler()

        « x will be declared local here

        for x = 1 to 3

            local (n = 5)

            « y will be declared local here

            y = x * n

A handler's parameters are implicitly declared local at the handler's top level. Here is an impractical but illustrative script:

on countTo (n)

    « n is a local variable with its scope starting here

    local (i)

    for i = 1 to n

        msg (i)

        clock.waitseconds (1)

countTo (3)

Dynamic Scoping

The subordinate bundle of a local handler is considered, on any particular occasion when it is executing, to dwell physically at the point where the handler was called. To illustrate schematically, a script that looks like this:

on myHandler()

    « the statements of myHandler()

local (x = 6)

with Eudora

    myHandler()

is equivalent to this:

local (x = 6)

with Eudora

    « the statements of myHandler()

This is significant in two regards. First, any object references among the statements of myHandler() are now inside a with; resolving them, Frontier will look in the Eudora domain before it looks in the local namespace outside the with or in the database in general.

Second, it follows from the nature and rules of scope that the handler's subordinate bundle has access to variables and handlers within whose scope the handler was called. So, for instance, in this script:

on myHandler()

    « the statements of myHandler()

local (x = 6)

with Eudora

    on anotherHandler()

        « the statements of anotherHandler()

    myHandler()

the statements of myHandler() (when it is called in this particular way) are able to see and call anotherHandler(), and they are able to see x. Even more significant, perhaps, they are able to change x.

Since where a handler seems to be, and hence what variables and handlers its statements can see, depends upon where it is called on any particular occasion, UserTalk is said to have dynamic scoping of local handlers.

Dynamic scoping involves calls only to local handlers - it does not extend to calls to script objects in the database. In early versions of Frontier it did, but this was found to cause programmers too much confusion. UserTalk's limited dynamic scoping is not confusing, provided one remembers it; indeed, it seems quite natural. Consider this script:

local (y = 7)

on whatIsY()

    msg(y)

bundle

    local (y = 22)

    whatIsY()

When the script is run, 22 appears in the Main Window, not 7. And this, one feels, is as it should be; whatIsY() is called from within a scope where y is 22, and it would be unnerving if the call had the power to jump outside of that scope.
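The same mechanism lets a local handler change a variable belonging to the scope from which it is called. A minimal sketch, in which addOne() and total are hypothetical names:

```usertalk
on addOne () « declares no variables of its own
    total = total + 1 « total is found in the scope from which addOne() is called
bundle
    local (total = 6)
    addOne ()
    msg (total) « shows the value after addOne() has changed it
```

Note that total is declared only inside the bundle; addOne() can reach it solely because of where the call takes place.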

Variables Global to a Handler

A variable to which the statements of a handler have access, and which is not local to the handler, may be described as global to the handler. Variables global to a handler provide a way to pass information to the handler directly, and not as a parameter. Here, for example, is Example 7-3 rewritten in such a way that sayGood() does not require any parameters:

on sayHi ()

    local (theDay, theMonth, theYear, theHour, theMin, theSec)

    date.get (clock.now(), @theDay, @theMonth, \

        @theYear, @theHour, @theMin, @theSec)

    local (what)

    on sayGood ()

        dialog.notify ("Good " + what + ".")

    if theHour >= 12

        what = "afternoon or evening"

    else

        what = "morning"

    sayGood ()

sayHi()

The handler sayGood()'s bundle can see what directly; what is global to sayGood(), and sayGood() can use what without being passed it as a parameter.

This sort of thing can save a great deal of overhead, because parameters are passed by value in UserTalk. For example, passing a string as a parameter means making a copy of that string; if the string is long, that's a lot of overhead, which builds up significantly if the handler is called many times. Taking advantage of the ability of local handlers to see variables directly can cut back on this overhead.

It is also common to take advantage of the fact that a handler has the power to change any variable that is global to it. Here, for example, is a pattern that is used frequently:

local (htmltext)

on add (s)

    htmltext = htmltext + s

if height != -1

    add (" height=" + height)

if width != -1

    add (" width=" + width)

if hspace != ""

    add (" hspace=" + hspace)

if align != ""

    add (" align=" + align)

The idea is to build up a string, concatenating pieces to it one at a time. The local handler add() is simply to save us from having to say:

htmltext = htmltext + ...

over and over again. The variable htmltext is global to add(), so add() is able to alter its value directly. This greatly reduces the number of times htmltext must be copied in the course of the routine. So, abstracting add() as a local handler threatens to generate overhead, but writing it so as to take advantage of global variables cancels the threat.

Naturally, with increased power comes increased responsibility and danger. Consider this script:

local (n = 1)

on countTo (what)

    for n = 1 to what « uh-oh!

        msg (n)

        clock.waitseconds (1)

countTo (3)

msg (n)

The programmer intended the n inside countTo() as a temporary counting variable, but has tromped on the value of the n declared in the first line. To avoid this, the second n should have been explicitly declared local in countTo(). Again we see the value of declaring one's variables.
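The corrected script, following the advice just given, might look like this:

```usertalk
local (n = 1)
on countTo (what)
    local (n) « this n belongs to countTo() alone
    for n = 1 to what
        msg (n)
        clock.waitseconds (1)
countTo (3)
msg (n) « the n of the first line, untouched by countTo()
```

The reasoning is the same as in Example 7-2: the innermost n takes precedence inside the handler, and it goes out of existence when the handler finishes.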

Recursion

Since a call to a local handler is necessarily within the handler's own scope, statements within the handler can see the handler's name. This implies that local handlers can call themselves (are recursive) in UserTalk. The permitted depth of recursion is not tremendous, but it is sufficient for many purposes: 40 levels.
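As a minimal sketch, here is a local handler that calls itself. The handler name factorial is hypothetical, and the argument is kept small so that the depth of recursion stays well under the limit just mentioned.

```usertalk
on factorial (n) « hypothetical local handler
    if n <= 1
        return 1 « the decision that ends the recursion
    return n * factorial (n - 1) « the handler sees its own name and calls itself
msg (factorial (5)) « computes 5 * 4 * 3 * 2 * 1
```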

There is, however, an important caveat: an eponymous handler executing because its script was called by another script cannot call itself. That's because the on line declaring the handler was never executed; Frontier simply used it to discover that there was an eponymous handler and leapt into that handler to begin executing.

This is not the place for a full discussion of recursion.6 Typically, a handler that calls itself will do so only based on some decision, and will often use the result of that call to decide whether to call itself again. In UserTalk, recursion is usually most appropriate when the increase of depth (the handler calling itself) mirrors an increase in depth in the entity on which it is to operate (such as file in folder in folder, or entry in table in table).

For example, suppose we wish to know whether a given table or any of its subtables or any of those subtables' subtables (and so on) contains an entry with a given name. The very statement of the problem suggests recursion immediately. The actual implementation involves arrays, datatypes, addresses and dereferencing, which we haven't come to yet; but the following pseudo-code shows how to structure the recursion.

*Example 7-4* *Pseudo-code illustrating recursion*
on tableContains (theTable, entryName)
    on thisTableContains (theTable, entryName) « local handler for recursion
        for each entry of theTable
            if this entry's name is entryName
                return true
            else
                if this entry is a table « recurse into subtable
                    if thisTableContains (this entry, entryName)
                        return true
        return false
    return thisTableContains (theTable, entryName)

1. Strictly speaking, the term "local" is probably redundant. All variables are local variables, as opposed to database entries, which are global. All handlers are local handlers, as opposed to script objects living in the database, which are global. However, from another point of view, database objects are a kind of variable, and eponymous handlers (which are global, in a sense) are a kind of handler; so the term "local" is still useful for emphasis and clarity.

2. Or unless sayHi() is this script's eponymous handler, but that's a special case.

3. A common beginner's error is forgetting the parentheses.

4. How to identify the bundle of an implicit local declaration is discussed later in this chapter.

5. It is very common to use msg() to display information during the course of a script's execution, to help track the script's activities, provide feedback during a lengthy operation, or debug. It has an advantage over the dialog verbs in that it doesn't pause execution or require any user response. It doesn't bring the Main Window to the front, though, so make sure that the Main Window (called Frontier.root in the Window menu) is visible before you run a script where you want to watch the results of msg().

6. Beginning programmers often fear recursion, as something dangerous or magical. The best way to get a sense for recursion is to learn Scheme (a simple Lisp dialect).
