architecture

Overview

Provides a series of commands for managing and validating user-defined architecture. User-defined architecture is a mechanism by which users may define their intended component architecture, mark files within their codebase with one or more file properties, and construct high-level architectural rules to enforce constraints placed on relationships within the codebase.

User-Defined Architecture Format

CodeMRI reads user-defined architecture (UDA) from a JSON file located at the path assigned to the uda_file system configuration property. This location defaults to <system root>/uda.json, but can be modified to point to any path within the file system. When using vaults hosted on a network share, take care not to reference files outside of the network share.

The UDA file consists of an object containing three array properties:

{
  "properties": [],
  "groups": [],
  "rules": []
}
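
As a rough illustration, the skeleton above could be loaded and sanity-checked with a few lines of Python. The helper name load_uda is hypothetical and not part of CodeMRI, and treating a missing array as empty is an assumption of this sketch rather than documented behavior.

```python
import json

# Top-level arrays described in this section.
UDA_KEYS = ("properties", "groups", "rules")


def load_uda(text):
    """Parse UDA JSON text and check its top-level arrays (hypothetical helper)."""
    uda = json.loads(text)
    if not isinstance(uda, dict):
        raise ValueError("a UDA file must contain a JSON object")
    for key in UDA_KEYS:
        # Assumption: an omitted array is treated as empty.
        uda.setdefault(key, [])
        if not isinstance(uda[key], list):
            raise ValueError(f"UDA property {key!r} must be an array")
    return uda


print(load_uda('{"properties": [], "groups": [], "rules": []}'))
```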

Properties

Properties are arbitrary names and values assigned to a given group of files within the codebase. A property name is valid as long as it does not contain any of the characters (, ), !, $, *, /, or \. There are two classes of properties known to CodeMRI:

  • Special properties are properties whose names have a special meaning to CodeMRI. See the Special Properties section below for more information.

  • Custom properties are arbitrary tags that may be assigned to a specific group of files. These properties have no special meaning to CodeMRI. Custom properties are displayed within the “file list” of specific reports, and can provide a convenient mechanism for users to filter results.
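
The naming restriction above can be checked mechanically. The following is a minimal sketch; the function name is hypothetical, and rejecting an empty name is an assumption, since the text only lists the forbidden characters.

```python
# Characters the text lists as not allowed in property names.
FORBIDDEN = set("()!$*/\\")


def is_valid_property_name(name):
    """Return True if the name is non-empty and has no forbidden character."""
    # Assumption: an empty name is rejected; the text does not say either way.
    return bool(name) and not (set(name) & FORBIDDEN)


for candidate in ["component", "fp_Test", "team/owner", "cost($)"]:
    print(candidate, is_valid_property_name(candidate))
```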

Property Structure

A property contains values; each value is a specific, named instance of that property. For example, think of the value graphics of the property component as a component named graphics. Property values may have one or more assignments declaring a group of things that are assigned the given property value. Property values, in some cases, may also have one or more dependencies declaring an intended dependency between the group of items assigned to the given property and the group of items assigned to another property.

At the time of writing, only values of the component property may specify dependencies.

Associations

The Property Structure section describes properties as containing values that may be assigned to items and in some cases depend on other values. Both assignments and dependencies fall into a category called “associations”. Associations are links between elements within the codebase and user-defined architecture. These links originate from the value on which the association is declared, and terminate at the group of items within the association. Associations consist of:

  • subject, which indicates the type of the associated element.

  • group, a list of “matchers” that define the set of associated elements. See Group for a description of matchers.

Subject

A subject indicates the type of element involved in the association. At the time of writing, CodeMRI recognizes two types of subjects:

  • file, individual files within the codebase. In JSON, a file subject looks like: {"type": "file"}

  • property, a value of another property defined in the UDA file. In JSON, a property subject looks like: {"type": "property", "name": "propertyName"}. See sections below for examples in context.

Group

A group is a set of elements defined by a list of matchers. CodeMRI processes these matchers in order. Matchers may include or exclude items from the set.

A typical matcher definition is as follows:

{"type": "inclusion", "matchers": {"name": {"match": ["test/*"]}}}

In this case, we are including items whose name matches the expression test/*. Matchers may specify multiple wildcard expressions. See Wildcard Expressions in the appendix for more information. Matchers contain the following fields:

  • type: Tells CodeMRI how to process the matcher within the context of the group. The two possible values are:

    • inclusion, meaning that items matching the matcher should be included in the group.

    • exclusion, meaning that items matching the matcher should be excluded from the group. A group with only exclusions will be empty.

  • matchers: Provides a mapping of field names to matching expressions. Currently the only type of expression is match, which provides a wildcard expression.

Matchers are processed in the order they appear in the UDA file. Consider the following group:

[
    {"type": "inclusion", "matchers": {"name": {"match": ["graphics/*"]}}},
    {"type": "exclusion", "matchers": {"name": {"match": ["*/tests/*"]}}}
]

CodeMRI will process these matchers in order, as follows:

  • The first matcher is an inclusion. The group at this point contains all of the files under the graphics/ directory tree, relative to the source root.

  • Next, CodeMRI encounters the exclusion. CodeMRI will now remove anything under any subdirectory named tests from the group. For example, if there were files under graphics/tests, they would be removed from the group.

Exclusions are useful for conflict resolution. See Conflict Resolution for more details about conflicts.
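
The ordered processing described above can be sketched in Python. This is not CodeMRI's implementation: the function name evaluate_group and the sample file names are hypothetical, only the name field is handled, and Python's fnmatchcase stands in for CodeMRI's wildcard matching because it supports the same *, ?, [chars], and [!chars] constructs.

```python
from fnmatch import fnmatchcase


def evaluate_group(items, group):
    """Apply matchers in order: inclusions add items, exclusions remove them."""
    selected = set()
    for matcher in group:
        # Only the "name" field with "match" expressions is handled here.
        patterns = matcher["matchers"]["name"]["match"]
        hits = {item for item in items
                if any(fnmatchcase(item, p) for p in patterns)}
        if matcher["type"] == "inclusion":
            selected |= hits
        elif matcher["type"] == "exclusion":
            selected -= hits
        else:
            raise ValueError(f"unknown matcher type: {matcher['type']!r}")
    return selected


# Hypothetical file names, relative to the source root.
files = ["graphics/buffer.c", "graphics/tests/test_buffer.c", "io/file.c"]
group = [
    {"type": "inclusion", "matchers": {"name": {"match": ["graphics/*"]}}},
    {"type": "exclusion", "matchers": {"name": {"match": ["*/tests/*"]}}},
]
print(sorted(evaluate_group(files, group)))  # ['graphics/buffer.c']
```

In this sketch, a group holding only the exclusion evaluates to an empty set, and reversing the two matchers leaves the test file in the group because the exclusion runs before anything has been included.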

Assignments

Property values may be assigned to a group of files within the codebase. Assignments are structured as follows:

{
    "subject": {"type": "file"},
    "group": [
        {"type": "inclusion", "matchers": {"name": {"match": ["graphics/*"]}}},
        {"type": "exclusion", "matchers": {"name": {"match": ["graphics/tests/test_*"]}}}
    ]
}

Dependencies

The values of the component property may contain dependencies that identify intended component-to-component relationships. Dependencies are defined as follows:

{
    "subject": {"type": "property", "name": "component"},
    "group": [
        {"type": "inclusion", "matchers": {"name": {"match": ["io.*"]}}}
    ]
}

Constraints

Constraints are restrictions placed on how relationships flow into and out of a given component. Constraints differ from rules; rules operate on the system as a whole, whereas constraints focus on a single component. Currently, only the access constraint is supported.

Access

The access constraint declares whether files in a component are public or private with respect to files outside of the component. A typical use case is defining the API of a component: files that are part of the API are marked as public, and files that should only be accessed by other files within the component are marked as private. Once these access constraints are defined, it is possible to find API violations between components, where a private file is accessed from another component.

If access constraints are omitted from a component definition, all files in the component will be marked as public. This ensures that existing UDA files will continue working.

If access constraints are present in a component, all files not explicitly marked as public or private will be marked as private by default. This minimizes the work needed to define constraints; only the API files need to be marked as public, and the rest of the files are marked as private automatically.

Here is an example constraint definition that shows a public constraint applied to a single file in a component. The file graphics/buffer.c will be marked as public while all other files in the component will be marked as private:

{
    "type": "access",
    "level": "public",
    "subject" : {"type": "file"},
    "group": [
        {"type": "inclusion", "matchers": {"name": {"match": ["graphics/buffer.c"]}}}
    ]
}

In this example, the private constraint is applied to all files in the component. All files will be marked private:

{
    "type": "access",
    "level": "private",
    "subject" : {"type": "file"},
    "group": [
        {"type": "inclusion", "matchers": {"name": {"match": ["*"]}}}
    ]
}

Here, the private constraint is applied to one of the files in the component. This has the same effect as the previous example; all files in the component will be marked private, because files are assumed to be private once any constraint is declared:

{
    "type": "access",
    "level": "private",
    "subject" : {"type": "file"},
    "group": [
        {"type": "inclusion", "matchers": {"name": {"match": ["graphics/buffer.c"]}}}
    ]
}

To define a few files as private and have the rest be public, apply a public constraint to everything except the private files using an exclusion. The following constraint will mark everything as public except graphics/buffer.c:

{
    "type": "access",
    "level": "public",
    "subject": {"type": "file"},
    "group": [
        {"type": "inclusion", "matchers": {"name": {"match": ["graphics/*"]}}},
        {"type": "exclusion", "matchers": {"name": {"match": ["graphics/buffer.c"]}}}
    ]
}
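
The defaulting behavior described in this section can be summarized with a short Python sketch. The function names and sample file names are hypothetical, fnmatchcase stands in for CodeMRI's wildcard matching, and letting a later constraint win when two constraints overlap is an assumption; the text does not say how overlaps are handled.

```python
from fnmatch import fnmatchcase


def _members(files, group):
    """Evaluate a group's matchers in order (inclusions add, exclusions remove)."""
    selected = set()
    for matcher in group:
        patterns = matcher["matchers"]["name"]["match"]
        hits = {f for f in files if any(fnmatchcase(f, p) for p in patterns)}
        if matcher["type"] == "inclusion":
            selected |= hits
        else:
            selected -= hits
    return selected


def resolve_access(files, constraints):
    """Map each file of a component to 'public' or 'private'."""
    access = [c for c in constraints if c.get("type") == "access"]
    if not access:
        # No access constraints declared: every file stays public.
        return {f: "public" for f in files}
    # Constraints declared: unmarked files default to private.
    levels = {f: "private" for f in files}
    for constraint in access:
        for f in _members(files, constraint["group"]):
            # Assumption: when constraints overlap, the later one wins.
            levels[f] = constraint["level"]
    return levels


# Hypothetical files belonging to one component.
files = ["graphics/buffer.c", "graphics/draw.c"]
public_api = {
    "type": "access", "level": "public", "subject": {"type": "file"},
    "group": [{"type": "inclusion",
               "matchers": {"name": {"match": ["graphics/buffer.c"]}}}],
}
print(resolve_access(files, []))
print(resolve_access(files, [public_api]))
```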

Special Properties

Certain properties have a special meaning to CodeMRI such that when files are tagged with these properties, CodeMRI alters its behavior. See the sections below for a description of each special property.

component

Components are arbitrary groups of files with a given name and a set of dependencies on other named groups of files. Components are often useful for breaking up a large codebase into logical groups that a human can reason about. As every codebase and organization is different, there is no set way to group a given set of files.

Here is an example of a component definition for a component named reader that depends on another component named io:

{
    "properties": [
        {
            "name": "component",
            "values": [
                {
                    "name": "reader",
                    "assignments": [{
                        "subject": {"type": "file"},
                        "group": [{"type": "inclusion", "matchers": {"name": {"match": ["src/reader/*"]}}}]
                    }],
                    "dependencies": [{
                        "subject": {"type": "property", "name": "component"},
                        "group": [{"type": "inclusion", "matchers": {"name": {"match": ["io"]}}}]
                    }]
                }
            ]
        }
    ]
}

fp_ThirdParty

Marks the files assigned to the value True as third party, to be excluded from economic projections.

Here is an example definition for marking “third party” files:

{
    "properties": [
        {
            "name": "fp_ThirdParty",
            "values": [
                {
                    "name": "True",
                    "assignments": [{
                        "subject": {"type": "file"},
                        "group": [{"type": "inclusion", "matchers": {"name": {"match": ["src/lib/*"]}}}]
                    }]
                }
            ]
        }
    ]
}

fp_Generated

Marks the provided files as generated files. Generated files are not directly maintained by the developer, but are produced by automated software.

Here is an example definition for marking “generated” files:

{
    "properties": [
        {
            "name": "fp_Generated",
            "values": [
                {
                    "name": "True",
                    "assignments": [{
                        "subject": {"type": "file"},
                        "group": [{"type": "inclusion", "matchers": {"name": {"match": ["src/generated/*"]}}}]
                    }]
                }
            ]
        }
    ]
}

fp_Test

Marks the provided files as test files. Test files are not part of production code, but part of the code that tests the software to verify that it works as intended.

Here is an example definition for marking “test” files:

{
    "properties": [
        {
            "name": "fp_Test",
            "values": [
                {
                    "name": "True",
                    "assignments": [{
                        "subject": {"type": "file"},
                        "group": [{"type": "inclusion", "matchers": {"name": {"match": ["src/test/*"]}}}]
                    }]
                }
            ]
        }
    ]
}
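
The fp_ThirdParty, fp_Generated, and fp_Test definitions above share the same shape, so they can be generated rather than written by hand. The sketch below uses a hypothetical helper, flag_property, and assumes that several special properties can be listed side by side in one properties array.

```python
import json


def flag_property(name, patterns):
    """Build a property that assigns the value True to files matching patterns."""
    return {
        "name": name,
        "values": [{
            "name": "True",
            "assignments": [{
                "subject": {"type": "file"},
                "group": [{"type": "inclusion",
                           "matchers": {"name": {"match": list(patterns)}}}],
            }],
        }],
    }


# Assumption: one UDA file may carry all three special properties together.
uda = {
    "properties": [
        flag_property("fp_ThirdParty", ["src/lib/*"]),
        flag_property("fp_Generated", ["src/generated/*"]),
        flag_property("fp_Test", ["src/test/*"]),
    ]
}
print(json.dumps(uda, indent=4))
```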

Conflict Resolution

As property definitions become more complex, conflicts are inevitable. A conflict occurs when two distinct values of the same property are assigned to the same file. In the event of a conflict, CodeMRI will not be able to produce a consistent analysis of the codebase. Attempts to scan a codebase with a conflicting property definition will fail.

When a conflict occurs, CodeMRI will log the files involved in the conflict to the appropriate log file, and produce an error message pointing to the log file containing conflict information.

To resolve conflicts, use exclusion matchers to explicitly define where the conflicting files belong. For example, consider the conflicting component definition below:

{
    "properties": [
        {
            "name": "component",
            "values": [
                {
                    "name": "io",
                    "assignments": [
                        {
                            "subject": {"type": "file"},
                            "group": [
                                {"type": "inclusion", "matchers": {"name": {"match": ["io/*"]}}}
                            ]
                        }
                    ]
                },
                {
                    "name": "network",
                    "assignments": [
                        {
                            "subject": {"type": "file"},
                            "_comment": "Some of our networking files still live under io/network/*.",
                            "group": [
                                {"type": "inclusion", "matchers": {"name": {"match": ["io/network/*", "network/*"]}}}
                            ]
                        }
                    ]
                }
            ]
        }
    ]
}

Within this component definition, the files under io/network/* are assigned to two components:

  • io

  • network

Given the comment in the network component, it is clear that the conflicting files should be excluded from the io component. To do so, we add an exclusion matcher to the group of the io component, resulting in the following definition:

{
    "properties": [
        {
            "name": "component",
            "values": [
                {
                    "name": "io",
                    "assignments": [
                        {
                            "subject": {"type": "file"},
                            "group": [
                                {"type": "inclusion", "matchers": {"name": {"match": ["io/*"]}}},
                                {"type": "exclusion", "matchers": {"name": {"match": ["io/network/*"]}}}
                            ]
                        }
                    ]
                },
                {
                    "name": "network",
                    "assignments": [
                        {
                            "subject": {"type": "file"},
                            "_comment": "Some of our networking files still live under io/network/*.",
                            "group": [
                                {"type": "inclusion", "matchers": {"name": {"match": ["io/network/*", "network/*"]}}}
                            ]
                        }
                    ]
                }
            ]
        }
    ]
}
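
Conflicts of this kind can be found before scanning by checking which files receive more than one value of a property. The following Python sketch is illustrative only: find_conflicts and the other helpers are hypothetical, the file names are made up, and fnmatchcase stands in for CodeMRI's wildcard matching.

```python
from fnmatch import fnmatchcase


def members(files, group):
    """Evaluate a group's matchers in order (inclusions add, exclusions remove)."""
    selected = set()
    for matcher in group:
        patterns = matcher["matchers"]["name"]["match"]
        hits = {f for f in files if any(fnmatchcase(f, p) for p in patterns)}
        if matcher["type"] == "inclusion":
            selected |= hits
        else:
            selected -= hits
    return selected


def find_conflicts(files, prop):
    """Return {file: [value names]} for files assigned more than one value."""
    owners = {}
    for value in prop["values"]:
        for assignment in value.get("assignments", []):
            for f in members(files, assignment["group"]):
                owners.setdefault(f, set()).add(value["name"])
    return {f: sorted(names) for f, names in owners.items() if len(names) > 1}


def component(name, include, exclude=()):
    """Build a component value with one file assignment (demo helper)."""
    group = [{"type": "inclusion", "matchers": {"name": {"match": list(include)}}}]
    if exclude:
        group.append({"type": "exclusion",
                      "matchers": {"name": {"match": list(exclude)}}})
    return {"name": name,
            "assignments": [{"subject": {"type": "file"}, "group": group}]}


files = ["io/file.c", "io/network/socket.c", "network/http.c"]
network = component("network", ["io/network/*", "network/*"])
before = {"name": "component", "values": [component("io", ["io/*"]), network]}
after = {"name": "component",
         "values": [component("io", ["io/*"], ["io/network/*"]), network]}
print(find_conflicts(files, before))  # {'io/network/socket.c': ['io', 'network']}
print(find_conflicts(files, after))   # {}
```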

Rules

Rules provide a way to enforce certain architectural constraints within the codebase. Rules allow users to explicitly provide architectural intent that cannot be automatically detected by a computer. Additionally, rules allow users to reduce the severity of detected issues. This is useful for CI environments where a set of pre-existing issues must be added to a whitelist.

In the broadest sense, a rule is a statement about a given association within a codebase. All rule functionality revolves around allowing, denying, or setting the severity of a set of associations. Think of rules as something akin to a network firewall for a codebase.

Types

There are four types of rules:

  • allow

  • deny

  • mark

  • require

Rules are processed in the order they appear in the UDA file. Rules share state, meaning that later rules can negate or partially negate the effects of earlier rules, similar to a network firewall definition. See the sections below for a detailed description of each rule type.

General Fields

With the exception of mark rules, all rules share a common set of fields. Fields within nested objects are indicated with dot notation, e.g. from.subject for {"from": {"subject": {...}}}:

rule_type

Required. Must be one of allow, deny, mark, require. See sections below for more information about individual types.

association_type

The type of association to restrict. At the time of writing only dependency is supported.

from

Required, except in mark rules. The items that originate the relationship to operate on.

from.subject

Required. The type of item originating the relationships to operate on. See Subject above for more information.

from.group

Required. A list of matchers that match items that originate the relationships to operate on. See Group above for more information. Please note that as of the time of writing, rules do not support multiple matchers or exclusion matchers.

to

Required, except in mark rules. The items on the terminal end of the relationship to operate on.

to.subject

Required. The type of item at the terminal end of the relationships to operate on. See Subject above for more information.

to.group

Required. A list of matchers that match items at the terminal end of the relationships to operate on. See Group above for more information. Please note that as of the time of writing, rules do not support multiple matchers or exclusion matchers.
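
The field requirements above lend themselves to a simple pre-flight check. The sketch below is a hypothetical validator, not something CodeMRI provides. Because the examples in this document write from and to as single-element lists, the check accepts either a list or a bare object; that leniency is an assumption.

```python
RULE_TYPES = {"allow", "deny", "mark", "require"}


def check_rule(rule):
    """Return a list of problems found in a rule; an empty list means none."""
    problems = []
    rule_type = rule.get("rule_type")
    if rule_type not in RULE_TYPES:
        problems.append("rule_type must be one of allow, deny, mark, require")
    for end in ("from", "to"):
        if end not in rule:
            # from and to are required, except in mark rules.
            if rule_type != "mark":
                problems.append(f"{end} is required")
            continue
        value = rule[end]
        # Assumption: accept both a list of entries and a single object.
        entries = value if isinstance(value, list) else [value]
        for entry in entries:
            if "subject" not in entry:
                problems.append(f"{end}.subject is required")
            group = entry.get("group")
            if not group:
                problems.append(f"{end}.group is required")
            elif len(group) != 1 or group[0].get("type") != "inclusion":
                # Rules support neither multiple nor exclusion matchers.
                problems.append(f"{end}.group must hold a single inclusion matcher")
    return problems


example = {
    "rule_type": "deny",
    "association_type": "dependency",
    "from": [{"subject": {"name": "component", "type": "property"},
              "group": [{"type": "exclusion",
                         "matchers": {"name": {"match": ["ui.*"]}}}]}],
}
print(check_rule(example))
```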

Allow

Allow rules remove the illegal state from a set of relationships affected by a deny rule. Think of allow rules as a way to exclude certain relationships from being affected by a deny rule. The syntax for adding an allow rule is:

{
    "rules": [
        {
            "rule_type": "allow",
            "association_type": "dependency",
            "from": [{
                "subject": {"name": "component", "type": "property"},
                "group": [{"type": "inclusion", "matchers": {"name": {"match": ["ui.*"]}}}]
            }],
            "to": [{
                "subject": {"name": "component", "type": "property"},
                "group": [{"type": "inclusion", "matchers": {"name":{ "match": ["services.*"]}}}]
            }]
        }
    ]
}

Note that rules are processed with default allow semantics, rather than default deny semantics. In other words, anything not explicitly denied will be allowed by default.

Deny

Deny rules mark the matching set of relationships as illegal with a severity of error. Deny rules are useful for imposing architectural constraints that go beyond forbidding circular relationships. A good example of this is a layered architecture, wherein relationships that jump over layers should be forbidden.

Here is an example deny rule:

{
    "rules": [
        {
            "rule_type": "deny",
            "association_type": "dependency",
            "from": [{
                "subject": {"name": "component", "type": "property"},
                "group": [{"type": "inclusion", "matchers": {"name": {"match": ["ui.*"]}}}]
            }],
            "to": [{
                "subject": {"name": "component", "type": "property"},
                "group": [{"type": "inclusion", "matchers": {"name":{ "match": ["persistence.*"]}}}]
            }]
        }
    ]
}

Note that rules are processed with default allow semantics, rather than default deny semantics. In other words, anything not explicitly denied will be allowed by default.

To change this behavior, add the following deny rule as the first item in your rules list:

{
    "rules": [
        {
            "rule_type": "deny",
            "association_type": "dependency",
            "from": [{
                "subject": {"name": "component", "type": "property"},
                "group": [{"type": "inclusion", "matchers": {"name": {"match": ["*"]}}}]
            }],
            "to": [{
                "subject": {"name": "component", "type": "property"},
                "group": [{"type": "inclusion", "matchers": {"name":{ "match": ["*"]}}}]
            }]
        }
    ]
}

This will disallow communication between all components in the codebase, requiring the user to explicitly allow any desired communication.

A common use of this functionality is enforcement of layered architectures, which have a very narrow set of constraints on communication. Here is a rule definition for a layered architecture wherein the layers are ui.* -> service.* -> persistence.*:

{
    "rules": [
        {
            "rule_type": "deny",
            "association_type": "dependency",
            "from": [{
                "subject": {"name": "component", "type": "property"},
                "group": [{"type": "inclusion", "matchers": {"name": {"match": ["*"]}}}]
            }],
            "to": [{
                "subject": {"name": "component", "type": "property"},
                "group": [{"type": "inclusion", "matchers": {"name":{ "match": ["*"]}}}]
            }]
        },
        {
            "rule_type": "allow",
            "association_type": "dependency",
            "from": [{
                "subject": {"name": "component", "type": "property"},
                "group": [{"type": "inclusion", "matchers": {"name": {"match": ["ui.*"]}}}]
            }],
            "to": [{
                "subject": {"name": "component", "type": "property"},
                "group": [{"type": "inclusion", "matchers": {"name":{ "match": ["service.*"]}}}]
            }]
        },
        {
            "rule_type": "allow",
            "association_type": "dependency",
            "from": [{
                "subject": {"name": "component", "type": "property"},
                "group": [{"type": "inclusion", "matchers": {"name": {"match": ["service.*"]}}}]
            }],
            "to": [{
                "subject": {"name": "component", "type": "property"},
                "group": [{"type": "inclusion", "matchers": {"name":{ "match": ["persistence.*"]}}}]
            }]
        }
    ]
}

In this example, we first deny everything, then we explicitly allow only the desired layer-to-layer communication. Anything outside of these constraints will be flagged as illegal.

Note that CodeMRI assumes that components are allowed to communicate with themselves. Internal communication within a component is not considered to be a “component relationship”.

Also note that having a standard naming convention for components is important, as wildcard expressions can save considerable time in constructing rule definitions for codebases with many components.
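
The firewall-like behavior of allow and deny rules can be modeled in a few lines of Python. This is a simplified sketch under stated assumptions, not CodeMRI's algorithm: the last matching rule decides the outcome, fnmatchcase stands in for CodeMRI's wildcard matching, and the component names such as ui.web are hypothetical.

```python
from fnmatch import fnmatchcase


def rule(kind, source, target):
    """Build an allow/deny rule between two component name patterns."""
    def end(pattern):
        return [{"subject": {"name": "component", "type": "property"},
                 "group": [{"type": "inclusion",
                            "matchers": {"name": {"match": [pattern]}}}]}]
    return {"rule_type": kind, "association_type": "dependency",
            "from": end(source), "to": end(target)}


def _matches(name, entries):
    """True if the component name matches any pattern in a from/to list."""
    return any(fnmatchcase(name, pattern)
               for entry in entries
               for matcher in entry["group"]
               for pattern in matcher["matchers"]["name"]["match"])


def is_illegal(source, target, rules):
    """Decide whether a component dependency ends up in the illegal state."""
    if source == target:
        return False  # components may always communicate with themselves
    illegal = False  # default allow semantics
    for r in rules:  # rules are processed in file order
        if r["rule_type"] not in ("allow", "deny"):
            continue
        if _matches(source, r["from"]) and _matches(target, r["to"]):
            # Assumption: the most recent matching rule decides.
            illegal = r["rule_type"] == "deny"
    return illegal


layered = [
    rule("deny", "*", "*"),
    rule("allow", "ui.*", "service.*"),
    rule("allow", "service.*", "persistence.*"),
]
print(is_illegal("ui.web", "service.auth", layered))    # False
print(is_illegal("ui.web", "persistence.db", layered))  # True
```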

Mark

Mark rules alter the severity of the matching set of relationships. Mark rules support two input sources for relationships:

  • Standard matcher declarations.

  • A CSV or TSV file containing explicit relationships to match. This is useful for whitelisting pre-existing problems.

Here is an example mark rule with standard matcher definitions:

{
    "rules": [
        {
            "rule_type": "mark",
            "association_type": "dependency",
            "states": ["illegal"],
            "severity": "warning",
            "from": [{
                "subject": {"name": "component", "type": "property"},
                "group": [{"type": "inclusion", "matchers": {"name": {"match": ["ui.*"]}}}]
            }],
            "to": [{
                "subject": {"name": "component", "type": "property"},
                "group": [{"type": "inclusion", "matchers": {"name":{ "match": ["persistence.*"]}}}]
            }]
        }
    ]
}

This mark rule would take relationships marked as illegal from ui.* to persistence.* and downgrade the severity from error to warning.
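
The effect of such a mark rule can be sketched as follows. The representation of a relationship as a dictionary with from, to, state, and severity keys is invented for this illustration and is not CodeMRI's data model; treating a rule without a states field as matching any state is likewise an assumption.

```python
from fnmatch import fnmatchcase


def _matches(name, entries):
    """True if the component name matches any pattern in a from/to list."""
    return any(fnmatchcase(name, pattern)
               for entry in entries
               for matcher in entry["group"]
               for pattern in matcher["matchers"]["name"]["match"])


def apply_mark(findings, rule):
    """Return copies of the findings, re-marked where the rule applies."""
    states = rule.get("states")  # assumption: no "states" means any state
    result = []
    for finding in findings:
        hit = ((states is None or finding["state"] in states)
               and _matches(finding["from"], rule["from"])
               and _matches(finding["to"], rule["to"]))
        updated = dict(finding)
        if hit:
            updated["severity"] = rule["severity"]
        result.append(updated)
    return result


def end(pattern):
    """Build a from/to list holding one inclusion matcher."""
    return [{"subject": {"name": "component", "type": "property"},
             "group": [{"type": "inclusion",
                        "matchers": {"name": {"match": [pattern]}}}]}]


mark_rule = {"rule_type": "mark", "association_type": "dependency",
             "states": ["illegal"], "severity": "warning",
             "from": end("ui.*"), "to": end("persistence.*")}

# Hypothetical relationships reported by a scan.
findings = [
    {"from": "ui.web", "to": "persistence.db", "state": "illegal", "severity": "error"},
    {"from": "service.auth", "to": "ui.web", "state": "illegal", "severity": "error"},
]
for finding in apply_mark(findings, mark_rule):
    print(finding["from"], "->", finding["to"], finding["severity"])
```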

Here is an example of a mark rule with an input file:

{
    "rules": [
        {
            "rule_type": "mark",
            "association_type": "dependency",
            "severity": "warning",
            "input": {
                "path": "known-component-errors.tsv",
                "from": {"subject": {"type": "property", "name": "component"}},
                "to": {"subject": {"type": "property", "name": "component"}}
            }
        }
    ]
}

CodeMRI looks for input files relative to the location of the UDA file in order to ensure that UDA files are portable. This is useful for maintaining a whitelist of pre-existing errors to mark as warnings.

Additional Fields

input

Required if from and to are not supplied. Metadata about the input file from which to obtain relationships to operate on.

input.path

Required. The path to the input file. Must be relative to the UDA file, ideally in the same directory.

input.from.subject

Required. The subject identifying the items originating the relationships. See Subject above for more information.

input.to.subject

Required. The subject identifying the items at the terminal end of the relationships. See Subject above for more information.

Require

Require rules elevate “missing” component relationships from warning status to error status. They are useful in cases where a certain class of component-to-component communication is needed. Consider the case where a “security” component must be referenced by services within a system. A rule definition for this might look like:

{
    "rules": [
        {
            "rule_type": "require",
            "association_type": "dependency",
            "from": [{
                "subject": {"name": "component", "type": "property"},
                "group": [{"type": "inclusion", "matchers": {"name": {"match": ["service.*"]}}}]
            }],
            "to": [{
                "subject": {"name": "component", "type": "property"},
                "group": [{"type": "inclusion", "matchers": {"name":{ "match": ["security"]}}}]
            }]
        }
    ]
}

This will elevate the severity of any missing declared relationship from any component matching service.* to the component named security from warning to error.

Comments

As JSON does not have a standard for comments, CodeMRI allows for comments in the form of a _comment property:

{ "_comment": "This is a comment.", "...": "other properties here.." }

Use arrays to break comments over multiple lines:

{
    "_comment": [
        "This is a long comment",
        "broken over multiple lines."
    ]
}
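
Tools that post-process a UDA file may want to ignore these comments. A minimal Python sketch, using a hypothetical helper named strip_comments, removes every _comment property from parsed JSON data without modifying the original:

```python
import json


def strip_comments(node):
    """Return a copy of parsed JSON data without any _comment properties."""
    if isinstance(node, dict):
        return {key: strip_comments(value)
                for key, value in node.items() if key != "_comment"}
    if isinstance(node, list):
        return [strip_comments(value) for value in node]
    return node


text = """
{
    "_comment": ["This is a long comment", "broken over multiple lines."],
    "properties": [{"name": "component", "_comment": "Inline note.", "values": []}]
}
"""
print(json.dumps(strip_comments(json.loads(text))))
```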

Wildcard Expressions

Wildcard expressions provide a convenient way to match one or more items via a string with special characters:

  • * matches any number of any character. For example, fs/* will match anything starting with fs/.

  • ? matches any single character.

  • [chars] matches any character within the brackets. For example, [abc] will match a, b, or c.

  • [!chars] matches any character not within the brackets. For example, [!abc] will match anything that is not a, b, or c.

To match names that contain the literal characters *, ?, or [, escape them by wrapping them in brackets, i.e. [*], [?], and [[] respectively. We recommend avoiding the use of these special characters in component names.
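
Python's standard fnmatch module implements the same four constructs and the same bracket escaping, so it is a convenient way to experiment with expressions before putting them in a UDA file. Whether CodeMRI's matcher agrees with fnmatch in every edge case is an assumption; the names below are made up for the demonstration.

```python
from fnmatch import fnmatchcase

# (name, wildcard expression) pairs to try out.
samples = [
    ("fs/ext4/inode.c", "fs/*"),    # * spans any number of characters
    ("fs1", "fs?"),                 # ? stands for exactly one character
    ("fs", "fs?"),                  # nothing left for ? to match
    ("b", "[abc]"),                 # one of the listed characters
    ("d", "[!abc]"),                # any character not listed
    ("a*b", "a[*]b"),               # [*] is a literal asterisk
    ("axb", "a[*]b"),               # so an ordinary character does not match
    ("a[b", "a[[]b"),               # [[] is a literal opening bracket
]
for name, expression in samples:
    print(f"{name!r} vs {expression!r}: {fnmatchcase(name, expression)}")
```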

Sub-commands