Enumerated type
In computer programming, an enumerated type (also called enumeration or enum, or factor in the R programming language, and a categorical variable in statistics) is a data type consisting of a set of named values called elements, members, enumeral, or enumerators of the type. The enumerator names are usually identifiers that behave as constants in the language. A variable that has been declared as having an enumerated type can be assigned any of the enumerators as a value. In other words, an enumerated type has values that are different from each other, and that can be compared and assigned, but which are not specified by the programmer as having any particular concrete representation in the computer's memory; compilers and interpreters can represent them arbitrarily.
For example, the four suits in a deck of playing cards may be four enumerators named CLUB, DIAMOND, HEART, SPADE, belonging to an enumerated type named suit. If a variable V is declared having suit as its data type, one can assign any of those four values to it.
Although the enumerators are usually distinct, some languages may allow the same enumerator to be listed twice in the type's declaration. The names of enumerators need not be semantically complete or compatible in any sense. For example, an enumerated type called color may be defined to consist of the enumerators RED, GREEN, ZEBRA, MISSING, and BACON. In some languages, the declaration of an enumerated type also intentionally defines an ordering of its members; in others, the enumerators are unordered; in others still, an implicit ordering arises from the compiler concretely representing enumerators as integers.
Some enumerator types may be built into the language. The Boolean type, for example is often a pre-defined enumeration of the values FALSE and TRUE. Many languages allow the user to define new enumerated types.
Values and variables of an enumerated type are usually implemented as fixed-length bit strings, often in a format and size compatible with some integer type. Some languages, especially system programming languages, allow the user to specify the bit combination to be used for each enumerator. In type theory, enumerated types are often regarded as tagged unions of unit types. Since such types are of the form , they may also be written as natural numbers.
Contents
Rationale
Some early programming languages did not originally have enumerated types. If a programmer wanted a variable, for example myColor, to have a value of red, the variable red would be declared and assigned some arbitrary value, usually an integer constant. The variable red would then be assigned to myColor. Other techniques assigned arbitrary values to strings containing the names of the enumerators.
These arbitrary values were sometimes referred to as magic numbers since there often was no explanation as to how the numbers were obtained or whether their actual values were significant. These magic numbers could make the source code harder for others to understand and maintain.
Enumerated types, on the other hand, made the code more self-documenting. Depending on the language, the compiler could automatically assign default values to the enumerators thereby hiding unnecessary detail from the programmer. These values may not even be visible to the programmer (see information hiding). Enumerated types can also prevent a programmer from writing illogical code such as performing mathematical operations on the values of the enumerators. If the value of a variable that was assigned an enumerator were to be printed, some programming languages could also print the name of the enumerator rather than its underlying numerical value. A further advantage is that enumerated types can allow compilers to enforce semantic correctness. For instance:
myColor = TRIANGLE
can be forbidden, whilst
myColor = RED
is accepted, even if TRIANGLE and RED are both internally represented as 1.
Conceptually, an enumerated type is similar to a list of nominals (numeric codes), since each possible value of the type is assigned a distinctive natural number. A given enumerated type is thus a concrete implementation of this notion. When order is meaningful and/or used for comparison, then an enumerated type becomes an ordinal type.
Conventions
In some programming conventions, enumerators are conventionally written with upper case letters to indicate they are constants.
Pascal and syntactically similar languages
Pascal
In Pascal, an enumerated type can be implicitly declared by listing the values in a parenthesised list:
var
suit: (clubs, diamonds, hearts, spades);
The declaration will often appear in a type synonym declaration, such that it can be used for multiple variables:
type
cardsuit = (clubs, diamonds, hearts, spades);
card = record
suit: cardsuit;
value: 1 .. 13;
end;
var
hand: array [ 1 .. 13 ] of card;
trump: cardsuit;
The order in which the enumeration values are given matters. An enumerated type is an ordinal type, and the pred
and succ
functions will give the prior or next value of the enumeration, and ord
can convert enumeration values to their integer representation. Standard Pascal does not offer a conversion from arithmetic types to enumerations, however. Extended Pascal offers this functionality via an extended succ
function. Some other Pascal dialects allow it via type-casts. Some modern descendants of Pascal, such as Modula-3, provide a special conversion syntax using a method called VAL
; Modula-3 also treats BOOLEAN
and CHAR
as special pre-defined enumerated types and uses ORD
and VAL
for standard ASCII decoding and encoding.
Pascal style languages also allow for enumeration to be used as array index
var
suitcount: array [cardsuit] of integer;
Ada
In Ada, the use of "=" was replaced with "is" leaving the definition quite similar:
type Cardsuit is (clubs, diamonds, hearts, spades);
In addition to Pred
, Succ
, Val
and Pos
Ada also supports simple string conversions via Image
and Value
.
Similar to C-style languages Ada allows the internal representation of the enumeration to be specified:
for Cardsuit use
(clubs => 1, diamonds => 2, hearts => 4, spades => 8);
Unlike C-style languages Ada also allows the number of bits of the enumeration to be specified:
for Cardsuit'Size use 4; -- 4 bits
Even more, one can use enumerations as indexes for arrays like Pascal, but there are attributes defined for enumerations
Shuffle : constant array(Cardsuit) of Cardsuit :=
(Clubs => Cardsuit'Succ(Clubs), -- see attributes of enumerations 'First, 'Last, 'Succ, 'Pred
Diamonds => Hearts, --an explicit value
Hearts => Cardsuit'Last, --first enumeration value of type Cardsuit e.g. clubs
Spades => Cardsuit'First --last enumeration value of type Cardsuit e.g. spades
);
Like Modula-3 Ada treats Boolean
and Character
as special pre-defined (in package "Standard
") enumerated types. Unlike Modula-3 one can also define own character types:
type Cards is ('7', '8', '9', 'J', 'Q', 'K', 'A');
C and syntactically similar languages
C
The original K&R dialect of the C programming language did not have enumerated types, but they were added in the ANSI standard for C, which became C89. In C, enumerations are created by explicit definitions, which use the enum
keyword and are reminiscent of struct and union definitions:
enum cardsuit {
CLUBS,
DIAMONDS,
HEARTS,
SPADES
};
struct card {
enum cardsuit suit;
short int value;
} hand[13];
enum cardsuit trump;
C exposes the integer representation of enumeration values directly to the programmer. Integers and enum values can be mixed freely, and all arithmetic operations on enum values are permitted. It is even possible for an enum variable to hold an integer that does not represent any of the enumeration values. In fact, according to the language definition, the above code will define CLUBS
, DIAMONDS
, HEARTS
, and SPADES
as constants of type int
, which will only be converted (silently) to enum cardsuit
if they are stored in a variable of that type.
C also allows the programmer to choose the values of the enumeration constants explicitly, even without type. For example,
enum cardsuit {
CLUBS = 1,
DIAMONDS = 2,
HEARTS = 4,
SPADES = 8
};
could be used to define a type that allows mathematical sets of suits to be represented as an enum cardsuit
by bitwise logic operations.
Perl
Dynamically typed languages in the syntactic tradition of C (e.g., Perl or JavaScript) do not, in general, provide enumerations. But in Perl programming one can obtain the same result with the shorthand strings list and hashes (possibly slices):
my @enum = qw(CLUBS DIAMONDS HEARTS SPADES);
my( %set1, %set2 );
@set1{@enum} = (); # all cleared
@set2{@enum} = (1) x @enum; # all set to 1
$set1{CLUBS} ... # false
$set2{DIAMONDS} ... # true
C#
Enumerated types in the C# programming language preserve most of the "small integer" semantics of C's enums. Some arithmetic operations are not defined for enums, but an enum value can be explicitly converted to an integer and back again, and an enum variable can have values that were not declared by the enum definition. For example, given
enum Cardsuit { Clubs, Diamonds, Spades, Hearts };
the expressions CardSuit.Diamonds + 1
and CardSuit.Hearts - CardSuit.Clubs
are allowed directly (because it may make sense to step through the sequence of values or ask how many steps there are between two values), but CardSuit.Hearts*CardSuit.Spades
is deemed to make less sense and is only allowed if the values are first converted to integers.
C# also provides the C-like feature of being able to define specific integer values for enumerations. By doing this it is possible to perform binary operations on enumerations, thus treating enumeration values as sets of flags. These flags can be tested using binary operations or with the Enum type's builtin 'HasFlag' method.
The enumeration definition defines names for the selected integer values and is syntactic sugar, as it is possible to assign to an enum variable other integer values that are not in the scope of the enum definition.[1][2][3]
C++
C++ has enumeration types that are directly inherited from C's and work mostly like these, except that an enumeration is a real type in C++, giving additional compile-time checking. Also (as with structs) the C++ enum
keyword is automatically combined with a "typedef", so that instead of calling the type "enum name", one simply calls it "name." This can be simulated in C using a typedef: typedef enum {VALUE1, VALUE2} name;
C++11 provides a second, type-safe enumeration type that is not implicitly converted to an integer type. It allows io streaming to be defined for that type. Additionally the enumerations do not leak, so they have to be used with Enumeration Type::enumeration
. This is specified by the phrase "enum class". For example:
enum class Color {Red, Green, Blue};
The underlying type is an implementation-defined integral type that is large enough to hold all enumerated values (it doesn't have to be the smallest possible type!). In C++ you can specify the underlying type directly. That allows "forward declarations" of enumerations:
enum class Color : long {Red, Green, Blue}; // must fit in size and memory layout the type 'long'
enum class Shapes : char; // forward declaration. If later there are values defined that don't fit in 'char' it is an error.
Go
Go uses the iota
keyword to create enumerated constants.[4]
type ByteSize float64
const (
_ = iota // ignore first value by assigning to blank identifier
KB ByteSize = 1 << (10 * iota)
MB
GB
)
Java
The J2SE version 5.0 of the Java programming language added enumerated types whose declaration syntax is similar to that of C:
enum Cardsuit { CLUBS, DIAMONDS, SPADES, HEARTS };
...
Cardsuit trump;
The Java type system, however, treats enumerations as a type separate from integers, and intermixing of enum and integer values is not allowed. In fact, an enum type in Java is actually a special compiler-generated class rather than an arithmetic type, and enum values behave as global pre-generated instances of that class. Enum types can have instance methods and a constructor (the arguments of which can be specified separately for each enum value). All enum types implicitly extend the Enum
abstract class. An enum type cannot be instantiated directly.[5]
Internally, each enum value contains an integer, corresponding to the order in which they are declared in the source code, starting from 0. The programmer cannot set a custom integer for an enum value directly, but one can define overloaded constructors that can then assign arbitrary values to self-defined members of the enum class. Defining getters allows then access to those self-defined members. The internal integer can be obtained from an enum value using the ordinal()
method, and the list of enum values of an enumeration type can be obtained in order using the values()
method. It is generally discouraged for programmers to convert enums to integers and vice versa.[6] Enumerated types are Comparable
, using the internal integer; as a result, they can be sorted.
The Java standard library provides utility classes to use with enumerations. The EnumSet
class implements a Set
of enum values; it is implemented as a bit array, which makes it very compact and as efficient as explicit bit manipulation, but safer. The EnumMap
class implements a Map
of enum values to object. It is implemented as an array, with the integer value of the enum value serving as the index.
TypeScript
A helpful addition to the standard set of datatypes from JavaScript is the 'enum'. Like languages like C#, an enum is a way of giving more friendly names to sets of numeric values.
enum Color {Red, Green, Blue};
var c: Color = Color.Green;
By default, enums begin numbering their members starting at 0. You can change this by manually setting the value of one its members. For example, we can start the previous example at 1 instead of 0:
enum Color {Red = 1, Green, Blue};
var c: Color = Color.Green;
Or, even manually set all the values in the enum:
enum Color {Red = 1, Green = 2, Blue = 4};
var c: Color = Color.Green;
A handy feature of enums is that you can also go from a numeric value to the name of that value in the enum. For example, if we had the value 2 but weren't sure which that mapped to in the Color enum above, we could look up the corresponding name:
enum Color {Red = 1, Green, Blue};
var colorName: string = Color[2];
alert(colorName);
Dynamically typed languages
Python 3.4
from enum import Enum
class Cards(Enum):
clubs = 1
diamonds = 2
hearts = 3
spades = 4
Note that for enums with larger option sets, assigning numbers one by one might be impractical. In such cases, using the range()
generator is possible:
class Cards(Enum):
# clubs = 0, diamonds = 1, hearts = 2, spades = 3
clubs, diamonds, hearts, spades = range(4)
Fortran
Fortran only has enumerated types for interoperability with C; hence, the semantics is similar to C and, as in C, the enum values are just integers and no further type check is done. The C example from above can be written in Fortran as
enum, bind( C )
enumerator :: CLUBS = 1, DIAMONDS = 2, HEARTS = 4, SPADES = 8
end enum
Visual Basic/VBA
Enumerated datatypes in Visual Basic (up to version 6) and VBA are automatically assigned the "Long
" datatype and also become a datatype themselves:
Enum CardSuit
Clubs
Diamonds
Hearts
Spades
End Enum
Sub EnumExample()
Dim suit As CardSuit
suit = Diamonds
MsgBox suit
End Sub
Example Code in vb.Net
Enum CardSuit
Clubs
Diamonds
Hearts
Spades
End Enum
Sub EnumExample()
Dim suit As CardSuit
suit = CardSuit.Diamonds
MsgBox(suit)
End Sub
Algebraic data type in functional programming
In functional programming languages in the ML lineage (e.g., SML, OCaml and Haskell), an algebraic data type with only nullary constructors can be used to implement an enumerated type. For example (in the syntax of SML signatures):
datatype cardsuit = Clubs | Diamonds | Hearts | Spades type card = { suit: cardsuit; value: int } val hand : card list val trump : cardsuit
In these languages the small-integer representation is completely hidden from the programmer, if indeed such a representation is employed by the implementation. However, Haskell has the Enum
type class which a type can derive or implement to get a mapping between the type and Int
.
Lisp
Common Lisp uses the member type specifier, e.g.
(deftype cardsuit ()
'(member club diamond heart spade))
that states that object is of type cardsuit if it is #'eql
to club, diamond, heart or spade. The member type specifier is not valid as a CLOS parameter specializer, however. Instead, (eql atom)
, which is the equivalent to (member atom)
may be used (that is, only one member of the set may be specified with an eql type specifier, however, it may be used as a CLOS parameter specializer.) In other words, in order to define methods to cover an enumerated type, a method must be defined for each specific element of that type.
Additionally,
(deftype finite-element-set-type (&rest elements)
`(member ,@elements))
may be used to define arbitrary enumerated types at runtime. For instance
(finite-element-set-type club diamond heart spade)
would refer to a type equivalent to the prior definition of cardsuit, as of course would simply have been using
(member club diamond heart spade)
but may be less confusing with the function #'member
for stylistic reasons.
Databases
Some databases support enumerated types directly. MySQL provides an enumerated type ENUM
with allowable values specified as strings when a table is created. The values are stored as numeric indices with the empty string stored as 0, the first string value stored as 1, the second string value stored as 2, etc. Values can be stored and retrieved as numeric indexes or string values.
XML Schema
XML Schema supports enumerated types through the enumeration facet used for constraining most primitive datatypes such as strings.
<xs:element name="cardsuit">
<xs:simpleType>
<xs:restriction base="xs:string">
<xs:enumeration value="Clubs"/>
<xs:enumeration value="Diamonds"/>
<xs:enumeration value="Hearts"/>
<xs:enumeration value="Spades"/>
</xs:restriction>
</xs:simpleType>
</xs:element>
References
- ↑ Lua error in package.lua at line 80: module 'strict' not found.
- ↑ Lua error in package.lua at line 80: module 'strict' not found.
- ↑ Lua error in package.lua at line 80: module 'strict' not found.
- ↑ Lua error in package.lua at line 80: module 'strict' not found.
- ↑ Lua error in package.lua at line 80: module 'strict' not found.
- ↑ Lua error in package.lua at line 80: module 'strict' not found.
External links
The Wikibook Ada Programming has a page on the topic of: Enumeration |