Modeling Metrics for UML Diagrams - Richard Seidl

Written by Richard Seidl | 08/31/2010

UML quantity metrics

Quantity metrics are counts of the diagram and model types contained in the UML model. The model types are further subdivided into counts of entities and counts of relationships.

The 9 UML diagram types are:

Number of…

Use case diagrams
Activity diagrams
Class diagrams
Sequence diagrams
Interaction diagrams
State diagrams
Component diagrams
Deployment diagrams
Design diagrams

The 17 entity types are:

Number of…

Sub-systems
Use cases
Actors
Components
Interfaces
Classes
Base/superclasses
Methods
Parameters
Attributes
Activities
Objects
States
Rules
Stereotypes
Design entities
Design entities referenced

The 10 relationship types are:

Number of…

Uses
Associations
Generalizations
Interactions
Class hierarchy levels
Method calls
Activity flows
State transitions
Test cases
Design relationships

These counts were selected for their relevance to the goals of object-oriented system design, following Basili and Rombach’s goal-question-metric method.

UML complexity metrics

Complexity metrics are calculated values that quantify selected complexities. Complexity is defined here in terms of the ratio of entities to relationships. The size of a set is determined by the number of its elements; the complexity of a set depends on the number of relationships between those elements. The more connections or dependencies there are in relation to the number of elements, the greater the complexity. The complexity of an individual entity is determined by the number of its sub-entities in relation to the number of relationships between these sub-entities. The overall complexity of the design can be simply stated as follows:

Against this background, the following complexity types were defined for UML models.

Complexity of the object interaction

The more interactions between objects and the more associations between classes there are, the greater the complexity. In this way, both the abstract level of the class and the physical level of the objects are taken into account. This measure is an inverse coupling metric. It is based on the empirical finding that systems with many dependencies between their parts are difficult to maintain.

Complexity of the class hierarchy

The more hierarchical levels there are in the class hierarchies, the more dependent the lower-level classes are on the higher-level classes. Deep inheritance has often been criticized for leading to increased complexity. This metric corresponds to the depth of inheritance tree metric of Chidamber and Kemerer. It is based on empirical evidence that object-oriented systems with deep inheritance trees (e.g. > 3) are more error-prone than others.

Complexity of the class data

The more data attributes a class has, the higher its complexity. This corresponds to the class attribute metric in the MOOD metrics. The design goal is to have many classes with few data attributes each, as opposed to few classes with many attributes each. This goal is based on the assumption that it is easier to test and maintain smaller data sets.

Complexity of the class functions

The more methods, i.e. functions, a class has, the higher its complexity, whereby it is assumed that each class has at least two implicit functions - a constructor and a destructor. This corresponds to Chidamber and Kemerer’s Number of Methods metric. The design goal is to have many classes with a minimum number of functions, as opposed to few classes with many methods. This goal is based on the assumption that it is easier to maintain and test a system that is divided into many small pieces of functionality.

Complexity of the object state

Objects are instances of a class, and objects have states. The more states they have, the more complex they are. A simple class is a singleton with one object that has one static state. A complex class is a class with several objects, each of which has several possible states. Neither the CK nor the MOOD metrics take state complexity into account, although it is a major driver of test effort along with the cyclomatic complexity of the methods. The design goal is to have as few object states as possible, but this is determined by the application. If an object such as an account has many states, e.g. opened, balanced, overdrawn, locked, closed, etc., they all need to be created and tested.

Complexity of the state transitions

The connecting lines of a state diagram represent the transitions from one state to another. A given state can have any number of successor states. The more there are, the higher the complexity of the state transition graph. As with the McCabe measure of cyclomatic complexity, we are actually measuring the ratio of edges to nodes in a graph. Only here the nodes are not statements but states, and the edges are not branches but transitions. The design goal is to have as few transitions as possible, as each state transition must be tested at least once and this drives up the test costs.
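The edge-to-node ratio described above can be sketched in a few lines. This is only an illustration, not the formula used by the author: the function names and the account state machine below are hypothetical, and the plain ratio of transitions to states is an assumption about how the measure could be normalized.

```python
from collections import Counter


def transition_ratio(transitions):
    """Ratio of transitions (edges) to distinct states (nodes)."""
    states = {state for pair in transitions for state in pair}
    if not states:
        raise ValueError("the state diagram contains no states")
    return len(transitions) / len(states)


def successors_per_state(transitions):
    """Number of outgoing transitions for each source state."""
    return Counter(source for source, _ in transitions)


# Hypothetical state diagram for an account object.
ACCOUNT_TRANSITIONS = [
    ("opened", "balanced"),
    ("balanced", "overdrawn"),
    ("overdrawn", "balanced"),
    ("overdrawn", "locked"),
    ("locked", "closed"),
    ("balanced", "closed"),
]

ratio = transition_ratio(ACCOUNT_TRANSITIONS)
fan_out = successors_per_state(ACCOUNT_TRANSITIONS)
```

Each entry in the transition list is also a candidate test, since every transition has to be exercised at least once.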

Activity control flow complexity

The connecting lines of an activity diagram represent the control flow from one activity to another. They can be conditional or unconditional flows. Conditional flows increase the complexity of the modeled process. An activity can have any number of subsequent activities. The more successors there are, and the more of the flows are conditional, the greater the complexity of the process.

Complexity of the use case

Use cases, a term coined by Ivar Jacobson, are instances of system usage: a user or system actor invokes a use case. The relationships between use cases can have different meanings. They can mean use, extension, inclusion or inheritance. The more relationships there are, the greater the complexity of the use case model. The goal of the design is to reduce complexity by limiting the number of dependencies between use cases. On the other hand, if the application needs them, then they must be included. Otherwise, the complexity is just pushed to another layer.

Complexity of the actor interaction

System actors trigger the use cases. Each individual actor can trigger one or more use cases. The more use cases there are per actor, the more complex the relationship between actors and the system. From an actor’s point of view, a system is complex if the actor has to deal with many use cases. A system that has only one use case per actor is simple because it is partitioned according to the actors. The aim of the design is to limit the number of use cases per actor. Of course, more actors increase the size of the system in use case points.

Overall complexity of the design

The overall complexity of the design is calculated as the ratio between the sum of all design entities and the sum of all design relationships.

A design in which each entity has only a few relationships can be considered less complex than a system design in which the number of relationships per entity is high. This reflects complexity as the ratio of the number of relationships between elements of a set and the number of elements in a set. The more elements there are, the larger the set. The more relationships there are, the higher the complexity of the set. The design goal is to minimize the number of relationships between the design objects.
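As a rough illustration of this ratio, the following sketch compares a sparsely and a densely connected design. The function names and the sample counts are hypothetical. The normalized form, 1 minus entities divided by relationships, is an assumption chosen so that the value rises towards 1 as relationships outnumber entities; it is not taken from the text.

```python
def relationships_per_entity(entities: int, relationships: int) -> float:
    """Average number of design relationships per design entity."""
    if entities <= 0:
        raise ValueError("the design must contain at least one entity")
    return relationships / entities


def normalized_complexity(entities: int, relationships: int) -> float:
    """Assumed normalization: 1 - entities/relationships, clamped at 0."""
    if relationships <= 0:
        return 0.0
    return max(0.0, 1.0 - entities / relationships)


# Two hypothetical designs with the same number of entities.
sparse = relationships_per_entity(entities=40, relationships=20)
dense = relationships_per_entity(entities=40, relationships=120)
dense_normalized = normalized_complexity(entities=40, relationships=160)
```

Under these assumptions the second design counts as the more complex one, because each of its entities participates in more relationships on average.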

UML quality metrics

The metrics for design quality are calculated values that quantify selected qualities. Quality is defined here as the ratio of the state the model is in to the state it should be in. Measuring quality therefore requires a standard for the UML model, against which the actual state of the model is compared. The closer the model is to this standard, the higher its quality. Put simply, the overall design quality can be expressed as the ratio of the ACTUAL state to the TARGET state.
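A minimal sketch of such an ACTUAL-to-TARGET quotient follows. The function name and the sample values are hypothetical, and capping the result at 1 is one possible reading of the metric's stated upper limit.

```python
def quality_quotient(actual: float, target: float) -> float:
    """Ratio of the measured (ACTUAL) value to the required (TARGET) value, capped at 1."""
    if target <= 0:
        raise ValueError("the target value must be positive")
    return min(actual / target, 1.0)


# Two hypothetical designs measured against the same target.
design_a = quality_quotient(actual=6, target=12)
design_b = quality_quotient(actual=15, target=12)
```

The quotients are only meaningful relative to each other, so both designs have to be measured in exactly the same way.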

The upper limit of the metric is 1. If the ACTUAL value exceeds the TARGET value, the quality target has been exceeded. A quotient of 0.5 indicates average quality. It should be borne in mind that quality is relative. Taken on its own, the quotient may not mean much. However, when compared with the quotient derived from another design in exactly the same way, it indicates that one design is of better or worse quality than the other, at least in relation to the quality characteristic measured. As there is no absolute quality scale, the quality of one system design can only be assessed in relation to the quality of another. The following quality characteristics were selected to assess the quality of a UML model.

Degree of class coupling

The class coupling is the reciprocal of the interaction complexity. It is calculated using the equation:

The more interactions and associations there are between objects and classes, the greater the dependency of these objects and classes on each other. This mutual dependency is referred to as coupling. Classes with a high degree of coupling have a greater range of influence. If they are changed, it is more likely that the other classes will also be affected. The aim of the design is to have as few dependencies as possible, i.e. the coupling should be low. This quality feature is based on the empirical finding that a high coupling is associated with a larger impact domain, a higher error rate and higher maintenance effort.

Degree of class cohesion

Class cohesion is measured by the number of data attributes in a class in relation to the number of class methods. It is calculated using the equation:

The term cohesion refers to the degree to which the functions of a module belong together. Functions belong together if they process the same data. This can be described as data coupling. The less data is used by the same functions, the better. Classes with high cohesion have many methods and few attributes. Classes with many attributes and few methods have less cohesion. The design goal is to have as few common attributes as possible for the same methods. This quality characteristic is based on the hypothesis that high cohesion is associated with high maintainability. This hypothesis has never really been proven.

Degree of modularity

Modularity is a measure of decomposition. It expresses the degree to which a large system has been broken down into many small parts. The theory is that it is easier to deal with smaller code units. The modularity of classes is determined by the number of attributes and methods that a class has. It is expressed by the equation:

There is a prevailing belief, supported by numerous field trials, that many smaller units of code are easier to change than a few larger ones. The old Roman principle of “divide et impera” also applies to software. It has not been proven that smaller modules are necessarily more error-free. Therefore, the justification for modularity is based on ease of change. When measuring code, modularity can be determined by comparing the actual size of code units in instructions with a predefined maximum size. In an object-oriented design, the elementary units are the methods. The number of methods per class should not exceed a defined limit. To measure the modularity of UML, it is advisable to compare the total number of methods with the minimum number of methods per class multiplied by the total number of classes. The design goal here is to have as few methods per class as possible to encourage the designer to create more and smaller classes.

Degree of portability

Portability at the design level is a measure of the ease with which the architecture can be ported to another environment. It is influenced by the way in which the design is packaged. Many small packages are easier to port than a few large ones. It is therefore important to keep the size of the packages as small as possible. The package size is a question of the number of classes per package. At the same time, packages should only have a few dependencies on their environment. The fewer interfaces each package has, the better. The portability of a system is expressed by the equation:

The justification for this quality feature goes in the same direction as that of modularity. The number of classes per package should not exceed a certain limit, and a package should also not have more than a certain number of interfaces with its environment, as interfaces connect a package to its environment. The design goal is to create packages with a minimum number of classes and interfaces.

Degree of reusability

Reusability is a measure of the ease with which code or design units can be removed from their original environment and transplanted into another environment. This means that there should be a minimum of dependencies between design units. Dependencies are expressed in UML as generalizations, associations and interactions. Therefore, the equation to measure the degree of dependency is:

The more generalizations, associations and interactions there are, the more difficult it is to remove individual classes and methods from the current architecture and reuse them in another. As with plants, if their roots are entangled with the roots of neighboring plants, it is difficult to transplant them. The entangled roots must be severed. This also applies to software. The degree of dependency should be as low as possible. Inheritance and interaction with other classes increase the degree of dependency and lower the degree of reusability. The design goal here is to have as few dependencies as possible.

Degree of testability

Testability is a measure of the effort required to test a system in relation to the size of the system. The less effort required, the higher the degree of testability. The test effort is determined by the number of test cases to be executed and the width of the interfaces, whereby this width is expressed by the number of parameters per interface. The equation for calculating testability is as follows:

The number of test cases required is calculated from the number of possible paths through the system architecture. To test an interface, the parameters of this interface must be set to different combinations. The more parameters it contains, the more combinations need to be tested. Practical experience has shown that it is easier to test several narrow interfaces, i.e. interfaces with few parameters, than several wide interfaces, i.e. interfaces with many parameters. Thus, not only the number of test cases, but also the width of the interfaces influences the testing effort. The design objective here is to design an architecture that can be tested with as little effort as possible. This can be achieved by minimizing the possible paths through the system and by modularizing the interfaces.

Degree of conformity

Conformity is a measure of the extent to which design rules are adhered to. Every software project should have a convention for naming entities. There should be prescribed names for data attributes and interfaces as well as for classes and methods. It is the responsibility of the project management to ensure that these naming conventions are made available. It is the responsibility of quality assurance to ensure that they are adhered to. The equation for conformity is very simple:

Incomprehensible names are the biggest obstacle to understanding code. No matter how well the code is structured, it remains incomprehensible as long as the code content is unclear due to inadequate data and procedure names. The names assigned in the UML diagrams are adopted in the code. They should therefore be selected with great care and comply with a strict naming convention. The aim is to encourage designers to use meaningful, standardized names in their design documentation.

Degree of consistency

Consistency in the design implies that the design documents are consistent with each other. You should not refer to a class or method in a sequence diagram that is not also contained in a class diagram. This would be inconsistent. The same applies to the methods in the activity diagrams. They should match the methods in the sequence and class diagrams. The parameters that are passed in the sequence diagrams should also be the parameters that are assigned to the methods in the class diagrams. The class diagrams are therefore the base diagrams. All other diagrams should match them. If not, there is a consistency problem. The equation for calculating consistency is as follows:

When we measure the degree of consistency, we come across one of the greatest weaknesses of the UML design language. It is inherently inconsistent. This is because it has been glued together from many different design diagram types, each of which has its own origin. State diagrams, activity diagrams and collaboration diagrams existed long before UML was born. They were adopted from structured design. The basis of object-oriented design is Grady Booch’s class diagram. Use case and sequence diagrams were added later by Ivar Jacobson. There was therefore never a uniform design for the UML language. The designer has the option of creating the different diagram types completely independently of each other. If the UML design tool does not check this, this leads to inconsistent naming. The design goal here is to force designers to use a common namespace for all diagrams and to ensure that the referenced methods, parameters and attributes are defined in the class diagrams.

Degree of completeness

The completeness of a design could mean that all requirements and use cases specified in the requirements document are covered by the design documentation. To check this, a link to the requirements repository would be necessary and it would have to be ensured that the same names are used for the same entities in the design as in the requirements text. Unfortunately, the state of information technology is far removed from this ideal. Hardly any IT project has a common namespace for all its documents, let alone a common repository. For this reason, only formal completeness is measured here, i.e. that all the required diagrams are available. The degree of completeness is a simple relationship between completed and required documents.
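Formal completeness of this kind can be sketched as a set comparison. The list of required diagram types below is a hypothetical project standard, and the function name is illustrative; only the ratio of completed to required diagrams comes from the text.

```python
# Hypothetical project standard: the diagram types that must be delivered.
REQUIRED_DIAGRAMS = {"use case", "activity", "class", "sequence", "state"}


def completeness(required: set, present: set) -> float:
    """Share of the required diagram types that are actually present."""
    if not required:
        raise ValueError("at least one diagram type must be required")
    return len(required & present) / len(required)


# Diagram types found in a hypothetical model; extra types do not count.
present = {"use case", "class", "sequence", "component"}
degree = completeness(REQUIRED_DIAGRAMS, present)
missing = sorted(REQUIRED_DIAGRAMS - present)
```

Listing the missing diagram types alongside the quotient makes the result actionable for the design team.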

The goal of the design here is to ensure that all the UML diagram types required for the project are actually present. In all the UML projects this author has tested, the design was never finalized. The pressure to start coding is too great and once coding starts, the design becomes obsolete.

Degree of compliance

The ultimate quality of a system design is that it fulfills the requirements. Not everything that is measured is important and much of what is important cannot be measured. That is certainly true here. Whether the user’s requirements are really met can only be determined by testing the end product against the requirements. The most that can be done is to compare the actors and use cases in the design with those in the requirements. Each functional requirement should be assigned to a use case in the requirements document. If this is the case, the use cases in the requirements document should cover all functional requirements. If the number of use cases in the design matches the number of use cases in the requirements, we can consider the design to be compliant with the requirements, at least formally. This can be expressed by the coefficient:

If more use cases were designed than required, this only shows that the solution is bigger than the problem. If fewer use cases are designed, then the design is obviously not compliant. The design goal here is to design a system that covers all requirements, at least at the use case level.

UML design size metrics

The design size metrics are calculated values to represent the size of a system. Of course, it is not the system itself that is measured here, but a model of the system. The system itself can only be measured when it is finished. Size measures are needed at an early stage in order to predict the effort required to produce and test a system. These measures can be derived from the requirements by analyzing the requirements texts or at design time by analyzing the design diagrams. Of course, both measurements can only be as good as the requirements and/or the design that is being measured. Since the design is more detailed and more likely to be complete, measuring the design size leads to a more reliable estimate. However, the design is not complete until much later than the requirements. This means that the initial cost estimate must be based on the requirements. If the design-based estimate exceeds the original estimate, it is necessary to remove functionality, i.e. omit less important use cases and objects. If the design-based estimate differs significantly from the original, it is necessary to stop the project and renegotiate the proposed time and cost. In any case, the project should be recalculated when the design is finalized.

There are various methods for estimating software project costs. Each is based on a different metric. When estimating a project, one should always estimate using at least three different methods. For this reason, five size metrics are used to give the estimator a choice. The five size metrics used are:

Data points
Function points
Object points
Use case points
Test cases

Data points

Data points is a size measure that was originally published by Sneed in 1990. It is intended to measure the size of a system solely on the basis of its data model, but including the user interfaces. It is a product of 4th generation software development, where applications are built around the existing data model. The data model in UML is expressed in the class diagrams. The user interfaces can be identified in the use case diagrams. This leads to the following calculation of data points:

Function points

Function points is a size measure that was originally introduced by Albrecht at IBM in 1979. It is intended to measure the size of a system based on its inputs and outputs together with its data files and interfaces. Inputs are weighted from 3 to 6, outputs from 4 to 7, files from 7 to 15 and system interfaces from 5 to 10. This method of system sizing is based on the structured system analysis and design technique. It has evolved over the years, but the basic counting scheme has remained unchanged. It was never intended for object-oriented systems, but can be adapted. In a UML design, the classes are closest to the logical files. The interactions between actors and use cases are closest to user inputs and outputs. The interfaces between the classes can be interpreted as system interfaces. With this rough approximation, we arrive at the following calculation of function points:

Object points

Object points were developed by Sneed in 1996 specifically for measuring the size of object-oriented systems. The idea was to find a size measure that could be easily derived from an object design. As such, it fits perfectly with UML design. Object points are obviously the best size measure for an object model. Classes weigh 4 points, methods weigh 3 points, interfaces weigh 2 points and attributes/parameters weigh one point. In this way, object points are calculated as:
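The weights stated above translate directly into a weighted sum. The sketch below assumes that attributes and parameters each count one point per element; the dictionary keys and the sample counts are hypothetical.

```python
# Weights as stated in the text; attributes and parameters count one point each.
OBJECT_POINT_WEIGHTS = {
    "classes": 4,
    "methods": 3,
    "interfaces": 2,
    "attributes": 1,
    "parameters": 1,
}


def object_points(counts: dict) -> int:
    """Weighted sum of the element counts of an object model."""
    return sum(weight * counts.get(kind, 0)
               for kind, weight in OBJECT_POINT_WEIGHTS.items())


# Hypothetical counts taken from the class diagrams of a small design.
sample_counts = {
    "classes": 10,
    "methods": 50,
    "interfaces": 5,
    "attributes": 40,
    "parameters": 60,
}
size = object_points(sample_counts)
```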

Use case points

Use case points were introduced in 1993 by G. Karner, a Swedish student working at Ericsson. The idea was to estimate the size of a software system based on the number of actors and the number and complexity of use cases. Both actors and use cases were categorized into three levels - simple, medium and difficult. The actors were rated on a scale of 1 to 3, while the use cases were rated on a scale of 5 to 15. The weighted actor and use case totals are added together to obtain the unadjusted use case points. This method is also suitable for measuring the size of a UML design, provided that the use cases and actors are all specified. Here, the median weights are applied to all actors and use cases, and the total is extended by the number of interactions between actors and use cases.
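A minimal sketch of this simplified count follows. It uses the median weights of the two scales (2 per actor, 10 per use case) and sums the weighted totals, as Karner's method does for unadjusted use case points. Adding the interactions unweighted is an assumption, as are the function name and the sample counts.

```python
MEDIAN_ACTOR_WEIGHT = 2      # middle of the 1-to-3 actor scale
MEDIAN_USE_CASE_WEIGHT = 10  # middle of the 5-to-15 use case scale


def use_case_points(actors: int, use_cases: int, interactions: int) -> int:
    """Unadjusted use case points with median weights.

    Assumption: each actor/use case interaction adds one unweighted point.
    """
    return (actors * MEDIAN_ACTOR_WEIGHT
            + use_cases * MEDIAN_USE_CASE_WEIGHT
            + interactions)


# Hypothetical counts taken from the use case diagrams.
ucp = use_case_points(actors=5, use_cases=12, interactions=20)
```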

Test cases

Test cases were first used by Sneed in 1978 as a yardstick for estimating the test effort for the Siemens Integrated Transport System - ITS. The motivation behind this was to calculate the module test on the basis of test cases. A test case was defined as the equivalent of a path through the test object. Much later, the method was revived to estimate the costs of testing systems. When testing systems, a test case is equivalent to a path through the system. It starts at the interaction between an actor and the system and either follows a path through the activity diagrams or traverses the sequence diagrams via interactions between classes. There should be one test case for each path through the interaction diagrams and for each object state specified in the state diagrams. The number of test cases therefore results from the use case interactions times the number of class interactions times the number of object states. It is calculated as follows:
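The product described above can be sketched as follows. The function name and the sample counts are hypothetical. Treating an empty factor as 1, so that a missing diagram type does not zero the whole estimate, is an assumption and not part of the text.

```python
def estimated_test_cases(use_case_interactions: int,
                         class_interactions: int,
                         object_states: int) -> int:
    """Product of the three counts named in the text.

    Assumption: a count of zero is treated as one so that a missing
    diagram type does not reduce the estimate to zero.
    """
    factors = (use_case_interactions, class_interactions, object_states)
    result = 1
    for factor in factors:
        result *= max(factor, 1)
    return result


# Hypothetical counts for a small design.
test_cases = estimated_test_cases(use_case_interactions=20,
                                  class_interactions=3,
                                  object_states=4)
```

Because the counts are multiplied, the estimate grows quickly with every additional interaction or state, which is why a design with few paths is cheaper to test.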

Automated analysis of UML designs with UMLAudit

The UMLAudit tool was developed to measure UML designs. UMLAudit is a member of the SoftAudit toolset for automated quality assurance. This tool set also contains analysis tools for English and German requirement texts as well as for all leading programming languages, the most common database schema languages and several languages for defining user and system interfaces. UMLAudit contains an XML parser that parses the XML files generated by the UML modeling tool to display the diagrams. The diagram and model instance types, names and relationships are included as attributes that can be easily recognized by their model types and names. The measurement object is the XML schema of the UML 2 model with its model types, as specified by the OMG.
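How model types can be read from such an XML export is sketched below. This is not UMLAudit's implementation. The sample document, its namespace URIs and the element names are assumptions made for illustration; real exports differ between modeling tools and XMI versions.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Assumed namespace URI; it must match the xmlns:xmi declaration of the export.
XMI_NS = "http://www.omg.org/XMI"

# Hypothetical, heavily shortened export of a UML model.
SAMPLE = """<xmi:XMI xmlns:xmi="http://www.omg.org/XMI"
                     xmlns:uml="http://www.eclipse.org/uml2/5.0.0/UML">
  <uml:Model name="Bank">
    <packagedElement xmi:type="uml:Class" name="Account">
      <ownedAttribute name="balance"/>
      <ownedOperation name="deposit"/>
    </packagedElement>
    <packagedElement xmi:type="uml:Class" name="Customer"/>
    <packagedElement xmi:type="uml:Association" name="owns"/>
  </uml:Model>
</xmi:XMI>"""


def count_model_types(xml_text: str, xmi_namespace: str = XMI_NS) -> Counter:
    """Count elements by the value of their xmi:type attribute."""
    root = ET.fromstring(xml_text)
    # ElementTree expands prefixed attribute names to "{namespace-uri}local".
    type_attribute = f"{{{xmi_namespace}}}type"
    counts = Counter()
    for element in root.iter():
        model_type = element.get(type_attribute)
        if model_type is not None:
            counts[model_type] += 1
    return counts


type_counts = count_model_types(SAMPLE)
```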

The first step of UMLAudit is to collect the design types and names from the XML files and store them in tables. The second step is to go through the tables and count them. The third step is to check the names against the naming convention templates. The last step is to check the referential consistency by comparing the referenced entities with the defined entities. As a result, two outputs are generated:

a UML defect report and
a UML metrics report.

The UML defect report is a log of rule violations and discrepancies organized according to diagrams. There are currently only two types of defects:

inconsistent references and
violations of the naming rules.

If a diagram, e.g. a state, activity or sequence diagram, refers to a class, method or parameter that is not defined in the class diagram, an inconsistent reference is reported. If the name of an entity deviates from the naming rules for this entity type, a naming violation is reported. These deficiencies are totaled and compared to the number of model types and type names to determine design conformance.
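The two checks can be sketched as follows. This is an illustration, not UMLAudit's actual logic: the naming rules, the data layout and the sample entities are hypothetical, and the conformance ratio at the end (one minus defects per checked item) is an assumption about how the totals might be compared.

```python
import re

# Hypothetical naming convention: classes in UpperCamelCase, methods in lowerCamelCase.
NAMING_RULES = {
    "class": re.compile(r"[A-Z][A-Za-z0-9]*"),
    "method": re.compile(r"[a-z][A-Za-z0-9]*"),
}


def audit(defined, referenced):
    """Return a list of (defect type, diagram, entity kind, name) tuples.

    defined:    set of (kind, name) pairs taken from the class diagrams
    referenced: list of (diagram, kind, name) triples from the other diagrams
    """
    defects = []
    for kind, name in sorted(defined):
        rule = NAMING_RULES.get(kind)
        if rule is not None and not rule.fullmatch(name):
            defects.append(("naming violation", "class diagram", kind, name))
    for diagram, kind, name in referenced:
        if (kind, name) not in defined:
            defects.append(("inconsistent reference", diagram, kind, name))
    return defects


defined = {("class", "Account"), ("method", "deposit"), ("method", "Withdraw")}
referenced = [
    ("sequence: transfer", "method", "deposit"),
    ("state: account", "class", "Ledger"),
]
defects = audit(defined, referenced)

# Assumed conformance ratio: share of checked items without a defect.
checked = len(defined) + len(referenced)
conformance = 1 - len(defects) / checked
```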

The UML metrics report lists the quantity, complexity and quality metrics at file and system level. The quantity metrics are further subdivided into diagram quantities, structure quantities, relationship quantities and size metrics.

Conclusion

Judging a software system by its design is like judging a book by its table of contents. If the table of contents is very fine-grained, you can judge the structure and organization of the book and make assumptions about the content. The same is true for UML. If the UML design is fine-grained down to a detailed level, it is possible to make an assessment and estimate costs based on the design. If it is only coarse-grained, the assessment of the system will be superficial and the estimate unreliable. Measuring the size, complexity and quality of anything can only be as accurate as the thing you are measuring. UML is just a model and models do not necessarily reflect reality. In fact, they rarely do. UML models are often incomplete and inconsistent in practice, making it difficult to build a test on them. This remains the biggest obstacle to model-based testing.
