Perl has three basic data types: scalars, arrays, and hashes.
Scalars are essentially simple variables. They are preceded by a dollar sign ($). A scalar is either a number, a string, or a reference. (A reference is a scalar that points to another piece of data. References are discussed later in this chapter.) If you provide a string where a number is expected or vice versa, Perl automatically converts the operand using fairly intuitive rules.
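The automatic conversion between strings and numbers can be seen in a short sketch. The variable names here are illustrative only, and the warning that Perl would normally issue for a partly numeric string is silenced on purpose:

```perl
use strict;
use warnings;

# A numeric string is converted to a number by an arithmetic operator.
my $sum = "10" + 5;                  # 15

# A number is converted to a string by a string operator.
my $count = 10;
my $label = $count . " apples";      # "10 apples"

# Only the leading numeric part of a string is used. Perl warns about the
# rest when warnings are enabled, so the warning is switched off here.
my $partial;
{
    no warnings 'numeric';
    $partial = "3 blind mice" + 4;   # 7
}

print "$sum\n$label\n$partial\n";
```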
Arrays are ordered lists of scalars that you access with a numeric subscript (subscripts start at 0). They are preceded by an "at" sign (@).
Hashes are unordered sets of key/value pairs that you access using the keys as subscripts. They are preceded by a percent sign (%).
Perl stores numbers internally as either signed integers or double-precision floating-point values. Numeric literals are specified in any of the following floating-point or integer formats:

    12345              # integer
    -54321             # negative integer
    12345.67           # floating point
    6.02E23            # scientific notation
    0xffff             # hexadecimal
    0377               # octal
    4_294_967_296      # underline for legibility

Since Perl uses the comma as a list separator, you cannot use a comma to make a large number more legible. To improve legibility, Perl allows you to use an underscore character instead. The underscore only works within literal numbers specified in your program, not in strings functioning as numbers or in data read from somewhere else. Similarly, the leading 0x for hex and 0 for octal work only for literals. The automatic conversion of a string to a number does not recognize these prefixes; you must do an explicit conversion.
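One way to do the explicit conversion is with the built-in hex and oct functions. The following is a minimal sketch; the variable names are illustrative:

```perl
use strict;
use warnings;

my $from_literal = 0xff;         # the prefix is recognized in a literal: 255

# A string such as "0xff" is not treated as hexadecimal in numeric
# context, so it is converted explicitly.
my $from_hex = hex("0xff");      # 255
my $from_oct = oct("0377");      # 255
my $also_hex = oct("0xff");      # oct also understands a leading 0x: 255

print "$from_literal $from_hex $from_oct $also_hex\n";
```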
Strings are sequences of characters. String literals are usually delimited by either single (') or double quotes (").
Double-quoted string literals are subject to backslash and variable interpolation, and single-quoted strings are not (except for \' and \\, used to put single quotes and backslashes into single-quoted strings). You can embed newlines directly in your strings.
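The difference between the two quoting styles is easiest to see side by side. In this sketch the variable names are illustrative:

```perl
use strict;
use warnings;

my $name = "world";

my $double = "Hello, $name!\n";         # $name and \n are interpolated
my $single = 'Hello, $name!\n';         # taken literally, backslash and all
my $quoted = 'It\'s a backslash: \\';   # the two escapes single quotes allow

print $double;
print $single, "\n";
print $quoted, "\n";
```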
Table 4-1 lists all the backslashed or escape characters that can be used in double-quoted strings.
Code | Meaning |
---|---|
\n | Newline |
\r | Carriage return |
\t | Horizontal tab |
\f | Form feed |
\b | Backspace |
\a | Alert (bell) |
\e | ESC character |
\033 | ESC in octal |
\x7f | DEL in hexadecimal |
\cC | CTRL-C |
\\ | Backslash |
\" | Double quote |
\u | Force next character to uppercase |
\l | Force next character to lowercase |
\U | Force all following characters to uppercase |
\L | Force all following characters to lowercase |
\Q | Backslash all following non-alphanumeric characters |
\E | End \U, \L, or \Q |
Table 4-2 lists alternative quoting schemes that can be used in Perl. They reduce the number of commas and quotes you have to type, and they save you from escaping characters such as backslashes when your data contains many of them. The generic forms allow you to use any non-alphanumeric, non-whitespace character as the delimiter in place of the slash (/). If the delimiters are single quotes, no variable interpolation is done on the pattern. Parentheses, brackets, braces, and angle brackets can be used as delimiters in their standard opening and closing pairs.
Customary | Generic | Meaning | Interpolation |
---|---|---|---|
'' | q// | Literal | No |
"" | qq// | Literal | Yes |
`` | qx// | Command | Yes |
() | qw// | Word list | No |
// | m// | Pattern match | Yes |
s/// | s/// | Substitution | Yes |
y/// | tr/// | Translation | No |
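A few of the generic forms from Table 4-2 are sketched below with different delimiters. The variable names and sample strings are illustrative only:

```perl
use strict;
use warnings;

my $user = "camel";

my $literal = q{No $interpolation here};     # behaves like single quotes
my $interp  = qq{User "$user" logged in};    # behaves like double quotes
my @words   = qw(snap crackle pop);          # list of three strings

# Any non-alphanumeric, non-whitespace delimiter works.
my $path = q|C:\temp\new|;                   # backslashes kept as typed

print "$literal\n$interp\n@words\n$path\n";
```

Choosing a delimiter that does not occur in the data, as with the double quotes inside qq{} above, avoids the need for backslash escapes.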
A list is an ordered group of scalar values. A literal list can be composed as a comma-separated list of values contained in parentheses, for example:

    (1,2,3)                  # array of three values 1, 2, and 3
    ("one","two","three")    # array of three values "one", "two", and "three"

The generic form of list creation uses the quoting operator qw// to contain a list of values separated by white space:

    qw/snap crackle pop/

A variable always begins with the character that identifies its type: $, @, or %. Most of the variable names you create can begin with a letter or underscore, followed by any combination of letters, digits, or underscores, up to 255 characters in length. Upper- and lowercase letters are distinct. Variable names that begin with a digit can only contain digits, and variable names that begin with a character other than an alphanumeric or underscore can contain only that character. The latter forms are usually predefined variables in Perl, so it is best to name your variables beginning with a letter or underscore.
Variables have the undef value before they are first assigned or when they become "empty." For scalar variables, undef evaluates to zero when used as a number, and a zero-length, empty string ("") when used as a string.
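The behavior of undef can be sketched as follows. The variable names are illustrative, and the "uninitialized" warning is switched off only so that the demonstration runs quietly:

```perl
use strict;
use warnings;

my $count;                       # declared but never assigned: undef

my ($as_number, $as_string);
{
    # With warnings enabled, Perl reports each use of an undefined value,
    # so that warning is disabled inside this block.
    no warnings 'uninitialized';
    $as_number = $count + 1;             # undef acts as 0, giving 1
    $as_string = "[" . $count . "]";     # undef acts as "", giving "[]"
}

print defined($count) ? "defined\n" : "undefined\n";
print "$as_number $as_string\n";
```

The defined function is the usual way to tell undef apart from a genuine zero or empty string.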
Simple variable assignment uses the assignment operator (=) with the appropriate data. For example:

    $age = 26;                               # assigns 26 to $age
    @date = (8, 24, 70);                     # assigns the three-element list to @date
    %fruit = ('apples', 3, 'oranges', 6);    # assigns the list elements to %fruit in key/value pairs

Scalar variables are always named with an initial $, even when referring to a scalar value that is part of an array or hash.
Every variable type has its own namespace. You can, without fear of conflict, use the same name for a scalar variable, an array, or a hash (or, for that matter, a filehandle, a subroutine name, or a label). This means that $foo and @foo are two different variables. It also means that $foo[1] is an element of @foo, not a part of $foo.
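A brief sketch of the separate namespaces follows. Reusing one name for three variables is done here purely for illustration:

```perl
use strict;
use warnings;

my $foo = "scalar";
my @foo = ("first", "second");
my %foo = (key => "value");

print "$foo\n";         # the scalar $foo
print "$foo[1]\n";      # element 1 of the array @foo
print "$foo{key}\n";    # the value for 'key' in the hash %foo
```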
An array is a variable that stores an ordered list of scalar values. Arrays are preceded by an "at" (@) sign.

    @numbers = (1,2,3);    # Set the array @numbers to (1,2,3)

To refer to a single element of an array, use the dollar sign ($) with the variable name (it's a scalar), followed by the index of the element in square brackets (the subscript operator). Array elements are numbered starting at 0. Negative indexes count backwards from the last element in the list (i.e., -1 refers to the last element of the list). For example, in this list:

    @date = (8, 24, 70);

$date[2] is the value of the third element, 70.
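Positive and negative subscripts can be compared in a short sketch that reuses the @date list. The other variable names are illustrative:

```perl
use strict;
use warnings;

my @date = (8, 24, 70);

my $first = $date[0];     # 8, the first element
my $third = $date[2];     # 70, the third element
my $last  = $date[-1];    # 70, counting back from the end

$date[2] = 71;            # elements are assigned the same way

print "$first $third $last @date\n";
```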
A hash is a set of key/value pairs. Hashes are preceded by a percent (%) sign. To refer to a single element of a hash, use the dollar sign ($) with the hash name (the element is a scalar), followed by the key associated with the value in curly braces. For example, the hash:

    %fruit = ('apples', 3, 'oranges', 6);

has two values (in key/value pairs). To get the value associated with the key apples, use $fruit{'apples'}.
It is often more readable to use the => operator in defining key/value pairs. The => operator is similar to a comma, but it's more visually distinctive, and it also quotes any bare identifiers to the left of it:

    %fruit = ( apples => 3, oranges => 6 );

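Reading, adding, and listing hash elements can be sketched as follows. The pears entry is an illustrative addition, and the keys are sorted because a hash has no defined order:

```perl
use strict;
use warnings;

my %fruit = (apples => 3, oranges => 6);

my $apples = $fruit{apples};    # 3; a simple bare key needs no quotes
$fruit{pears} = 2;              # assigning to a new key adds a pair

for my $name (sort keys %fruit) {
    print "$name: $fruit{$name}\n";
}
```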
Every operation that you invoke in a Perl script is evaluated in a specific context, and how that operation behaves may depend on which context it is being called in. There are two major contexts: scalar and list. All operators know which context they are in, and some return lists in contexts wanting a list, and scalars in contexts wanting a scalar.
For example, the localtime function returns a nine-element list in list context:

    ($sec,$min,$hour,$mday,$mon,$year,$wday,$yday,$isdst) = localtime();

But in a scalar context, localtime returns a formatted date and time string (for example, "Thu Sep  9 12:00:00 1999") rather than a list:

    $now = localtime();

Statements that look confusing are easy to evaluate by identifying the proper context. For example, assigning what is commonly a list literal to a scalar variable:

    $a = (2, 4, 6, 8);

gives $a the value 8. The context forces the right side to evaluate to a scalar, and the action of the comma operator in the expression (in the scalar context) returns the value farthest to the right.
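The two behaviors of localtime can be compared directly. In this sketch the variable names are illustrative, and the printed values depend on when the script runs:

```perl
use strict;
use warnings;

my @fields = localtime();        # list context: nine elements
my $stamp  = localtime();        # scalar context: a single date string

my $year = $fields[5] + 1900;    # the year element counts from 1900

print scalar(@fields), " fields, year $year\n";
print "$stamp\n";
```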
Another type of statement that might be confusing is the evaluation of an array or hash variable as a scalar, for example:

    $b = @c;

When an array variable is evaluated as a scalar, the number of elements in the array is returned. This type of evaluation is useful for finding the number of elements in an array. The special $#array form of an array value returns the index of the last member of the list (one less than the number of elements).
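The element count and the last index can be seen together in a short sketch. The array contents and variable names are illustrative:

```perl
use strict;
use warnings;

my @c = ('a', 'b', 'c', 'd');

my $count      = @c;     # scalar context: number of elements, 4
my $last_index = $#c;    # index of the last element, 3
my ($first)    = @c;     # parentheses give list context: 'a'

print "$count $last_index $first\n";
```

The parentheses around $first matter: they turn the assignment into a list assignment, so the first element is copied instead of the count.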
If necessary, you can force a scalar context in the middle of a list by using the scalar function.
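The effect of scalar inside a list can be sketched like this. The array names are illustrative:

```perl
use strict;
use warnings;

my @items = qw(pen ink paper);

# Without scalar, @items is flattened into the surrounding list.
my @flattened = ("count:", @items);           # four elements

# With scalar, the array is evaluated in scalar context and yields its size.
my @report = ("count:", scalar(@items));      # ("count:", 3)

print scalar(@flattened), " versus ", scalar(@report), " elements\n";
print "@report\n";
```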
In Perl, only subroutines and formats require explicit declaration. Variables (and similar constructs) are automatically created when they are first assigned.
Variable declaration comes into play when you need to limit the scope of a variable's use. You can do this in two ways:
Dynamic scoping creates temporary objects within a scope. Dynamically scoped constructs are visible globally, but only take action within their defined scopes. Dynamic scoping applies to variables declared with local.
Lexical scoping creates private constructs that are only visible within their scopes. The most frequently seen form of lexically scoped declaration is the declaration of my variables.
Therefore, we can say that a local variable is dynamically scoped, whereas a my variable is lexically scoped. Dynamically scoped variables are visible to functions called from within the block in which they are declared. Lexically scoped variables, on the other hand, are totally hidden from the outside world, including any called subroutines unless they are declared within the same scope.
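The difference between the two kinds of scoping shows up when one subroutine calls another. The following is a minimal sketch; all variable and subroutine names are hypothetical:

```perl
use strict;
use warnings;

our $dynamic = "global";    # a package variable, which local can override
my  $lexical = "outer";     # a file-scoped lexical variable

sub show         { return $dynamic }    # sees the value current at call time
sub show_lexical { return $lexical }    # always refers to the file-scoped one

sub with_local {
    local $dynamic = "temporary";    # restored when this subroutine returns
    return show();                   # the called subroutine sees "temporary"
}

sub with_my {
    my $lexical = "inner";           # a new variable, private to this block
    return show_lexical();           # the called subroutine cannot see it
}

print with_local(), "\n";    # temporary
print show(), "\n";          # global
print with_my(), "\n";       # outer
```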
See Section 4.7, "Subroutines," later in this chapter for further discussion.
Copyright © 2001 O'Reilly & Associates. All rights reserved.