CodeMRI Administration Example - Analyzing Many Codebases

Configuring Your Projects & Systems

Test source code we are using

In these examples we use open source code from early versions of the Apache HTTPd web server, early versions of the Linux kernel, and all released versions of Axis2.

You can follow along with the same codebases by downloading the <package>-<version>.tar.gz files from the following websites:

  • https://mirrors.edge.kernel.org/pub/linux/kernel/Historic/v0.99/
  • https://mirrors.edge.kernel.org/pub/linux/kernel/Historic/
  • https://mirrors.edge.kernel.org/pub/linux/kernel/v1.1/
  • https://archive.apache.org/dist/httpd/
  • https://github.com/apache/axis-axis2-java-core/tags

Where things might go

Location of the CodeMRI Data Vault where databases, reports, and other things are stored by CodeMRI

  • $HOME/Documents/test_datavault

Location where you store the source code you are analyzing

  • $HOME/Documents/test_sourcecode

You can put these wherever you like. I am going to do the following:

# on any Unix machine, including Linux and macOS
export CMRI_VAULT=$HOME/Documents/test_datavault
export CMRI_SOURCECODE=$HOME/Documents/test_sourcecode

Run these commands in your UNIX shell to set the variables for the current session only.

Add them to your .bashrc or .zshrc file if you want them to always be available.
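If you script this step, a small helper can append each export to the startup file only once, so rerunning the setup does not pile up duplicate lines. This is a minimal sketch; the function name add_line_once is made up for this example and is not part of CodeMRI.

```shell
# Append a line to a shell startup file only if it is not already there.
# Usage: add_line_once FILE LINE
add_line_once() {
  file=$1
  line=$2
  touch "$file"
  # -q: quiet, -x: match the whole line, -F: treat the text literally
  if ! grep -qxF -- "$line" "$file"; then
    printf '%s\n' "$line" >> "$file"
  fi
}

# Example calls (commented out so that running this sketch changes nothing).
# Single quotes keep $HOME unexpanded until the startup file is read.
# add_line_once "$HOME/.bashrc" 'export CMRI_VAULT=$HOME/Documents/test_datavault'
# add_line_once "$HOME/.bashrc" 'export CMRI_SOURCECODE=$HOME/Documents/test_sourcecode'
```

Open a new terminal, or source the startup file, for the change to take effect.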

Get Your Source Code Ready

In this demo, we put the source code into a directory named ‘test_sourcecode’ whose structure mirrors the one the CodeMRI Data Vault will take once the code is processed.

dan $ ls -l test_sourcecode
total 8
drwxr-xr-x  18 dan  staff   576 Oct 14 01:35 Apache_HTTPd
drwxr-xr-x  24 dan  staff   768 Oct 14 01:36 Axis2
drwxr-xr-x  44 dan  staff  1408 Oct 14 02:02 Linux_Kernel
-rw-r--r--   1 dan  staff   150 Oct 12 10:39 README

 

We downloaded the .tar.gz files into this directory structure:

 

The ‘tar’ command was used to unpack each source code directory. For this example we used Linux shell scripting to automate the process. Once this was complete, we moved the .tar.gz files into a subdirectory called ‘tar_files’:
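The scripts used for this step are not reproduced here. As a rough sketch of the same idea, the function below unpacks every .tar.gz file in one project directory and then moves the archives into ‘tar_files’. The function name unpack_project is an assumption made for this example.

```shell
# Unpack every .tar.gz in a project directory, then move each archive
# into a tar_files subdirectory.
# Usage: unpack_project PROJECT_DIR
unpack_project() {
  project_dir=$1
  mkdir -p "$project_dir/tar_files" || return 1
  for archive in "$project_dir"/*.tar.gz; do
    # If nothing matches, the pattern is left unexpanded; skip it
    [ -e "$archive" ] || continue
    # -x extract, -z gunzip, -f archive file, -C unpack into this directory
    if tar -xzf "$archive" -C "$project_dir"; then
      mv "$archive" "$project_dir/tar_files/"
    else
      echo "failed to unpack $archive" >&2
    fi
  done
}
```

For example, unpack_project "$CMRI_SOURCECODE/Axis2" would process the Axis2 downloads. An archive that fails to unpack stays where it is, so it is easy to spot afterwards.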

 

The Axis2 source code directory then looked like the following:

 

In the macOS ‘Finder’ application, the entire layout looked like the following:

We now have several versions of multiple projects stored for use by CodeMRI.

Install CodeMRI - follow installation instructions

Configure the CodeMRI Data Vault

Make sure your CodeMRI license is set up

Add Projects - via CodeMRI Command Line Shell

On the UNIX command line:

Alternatively, do this inside the CodeMRI shell itself:

 

 

Note: If you ever want to remove those projects, run the following

Add systems - via CodeMRI Command Line Shell

Using Apache_HTTPd as the example

 

At your terminal, you can open a CodeMRI shell (cmri) to run commands interactively

To add each system one at a time, type the following

Note: You could remove all the Apache_HTTPd systems from the DataVault by typing this:

Add systems - via Linux / Windows command Line

Using Linux_Kernel as the example

Helps with system administration & automation

This command tells you that important information is being captured

 

This command produces strings you will want to cut and paste into the terminal to add systems
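The original command is not reproduced here. As a sketch of the first half of that idea, the function below prints one line per version directory of a project, which a script can then turn into one command string per system. The name list_versions is made up for this example, and the arguments that ‘cmri system add’ expects are deliberately left out; take those from the CodeMRI documentation.

```shell
# Print one "name<TAB>path" line per version directory of a project,
# skipping the tar_files directory that holds the downloaded archives.
# Usage: list_versions PROJECT_DIR
list_versions() {
  project_dir=$1
  for dir in "$project_dir"/*/; do
    [ -d "$dir" ] || continue
    name=$(basename "$dir")
    [ "$name" = "tar_files" ] && continue
    # ${dir%/} drops the trailing slash left by the glob pattern
    printf '%s\t%s\n' "$name" "${dir%/}"
  done
}
```

Running list_versions "$CMRI_SOURCECODE/Linux_Kernel" gives a tab-separated list that is easy to feed into a while read loop or awk.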

Now run all the ‘cmri system add’ commands in a shell

Result:

Add systems - via CodeMRI Batch Mode

Using Axis2 as the example

Add systems - via Batch Script

Save the list of commands as a file: /tmp/Axis2_add.cmri and then run cmri in batch mode

Select Systems to Operate On

CodeMRI commands operate on ‘selected’ projects and systems in a DataVault. When you first start the CodeMRI shell, no systems are selected (unless environment variables are set up).

 

You can select a single system to operate on. Note below that one system now shows ‘SELECTED’ to be ‘True’

You can select multiple systems by using ‘globbing’. This allows CodeMRI to run operations on multiple systems in parallel.

See What Kind of Jobs Can Be Run

See details for ‘produce_reports’ job

Produce Reports for a System (Default Set of Reports)

First, select a single system

Then run a job called ‘produce_reports’ which generates a number of Excel files about the system

You can also do this for multiple systems in parallel. Your machine’s resources (CPUs, RAM) will be used to determine how many jobs will run in parallel:

First, select multiple systems

While you have jobs running in one terminal, you could open up a second terminal to see the history of jobs that are running and have been run:

Produce the ‘Technical Health Improvement Plan’ (THIP)

If the system has ‘core’ files, you will probably also want to produce a ‘Technical Health Improvement Plan’ (THIP). This can be a long-running job and is only necessary if the system has cores, so it is not included in the default set of Excel files.

Look at CodeMRI Excel Files

The previous commands (produce_reports, produce_thip) create a number of Excel files in the <data vault>/reports directory for each system
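To check from the shell which files were produced, something like the following sketch can help. It assumes the Excel files use the .xlsx extension and sit somewhere below a ‘reports’ directory directly inside the vault; both details, and the function name list_reports, are assumptions rather than documented CodeMRI behavior.

```shell
# List spreadsheet files under the reports directory of a data vault,
# sorted by path.
# Usage: list_reports VAULT_DIR
list_reports() {
  reports_dir=$1/reports
  if [ ! -d "$reports_dir" ]; then
    echo "no reports directory at $reports_dir" >&2
    return 1
  fi
  # Assumption: reports are .xlsx files at any depth below reports/
  find "$reports_dir" -type f -name '*.xlsx' | sort
}
```

With the variables set earlier, the call would be list_reports "$CMRI_VAULT".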

Other Jobs You Might Want To Run

Settings for Systems and Projects

The settings and what they do

Each project and system stored in a Data Vault has settings. To see the list of settings, do the following to show the options:

Modifying settings

Now let’s select a single system and look at its settings:

Now let’s look at all the settings:

Let’s change one of these settings. Note that ‘release_date’ is wrong: it is set to the date the scan was run, not the release date for the project. The timestamp on the source directory unpacked from the .tar.gz file shows that the release date was actually Nov 15, 2018.

Getting this right for each system will be useful later when we want to show trends across the project in the CodeMRI web interface. Note: If you are using the web interface, run the ‘export_web_data’ job after setting all the release_date fields properly

 

Let’s make the change:

We can also do this without first ‘selecting’ one or more systems, by passing the --selection flag to the command instead:

Modifying settings in bulk - ‘release_date’ example

Let’s make the dates correct for all the Axis2 systems in the Data Vault. Note that the directories have UNIX timestamps indicating the release date for each version. In this example, we’ll parse these dates and use them for inputs into CodeMRI.

Let’s rerun the ‘ls’ command to get the exact style we entered into CodeMRI previously for the ‘release_date’ setting:

Now we’ll write some UNIX script code to generate the CodeMRI command we want
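The original script is not reproduced here. The sketch below covers only the date-gathering half: it prints each version directory with its modification date, which can then be turned into CodeMRI commands. The function names and the YYYY-MM-DD output format are assumptions; adjust the format string to whatever style ‘release_date’ expects.

```shell
# Print the modification date of a file or directory as YYYY-MM-DD.
# GNU date (Linux) reads a file's timestamp with -r; where that fails,
# fall back to BSD stat as shipped with macOS.
mtime_date() {
  date -r "$1" +%Y-%m-%d 2>/dev/null || stat -f %Sm -t %Y-%m-%d "$1"
}

# Print "name<TAB>date" for each version directory, skipping tar_files.
# Usage: list_release_dates PROJECT_DIR
list_release_dates() {
  for dir in "$1"/*/; do
    [ -d "$dir" ] || continue
    name=$(basename "$dir")
    [ "$name" = "tar_files" ] && continue
    printf '%s\t%s\n' "$name" "$(mtime_date "$dir")"
  done
}
```

Keep in mind that a directory's timestamp changes when files are added to or removed from its top level, so it is safest to read the dates right after unpacking.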

 

Let’s confirm that this worked:

Note that the same thing can be done to set release_date for Apache_HTTPd because the .tar.gz files unzip with valid release dates for that project as well. For early versions of Linux, this doesn’t work. Instead you will have to get dates by looking at the download sites.

From here, you can copy the contents of the download pages into an editor and write some scripts to produce the correct CodeMRI commands
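As one possible starting point for such a script, the sketch below reads pasted listing lines and prints each archive name with an ISO-style date. It assumes every line has the form "<package>-<version>.tar.gz  DD-Mon-YYYY HH:MM  SIZE"; check this against what you actually pasted, since the layout of the download pages may differ. The function name listing_to_dates is made up for this example.

```shell
# Read pasted listing lines of the assumed form
#   <package>-<version>.tar.gz  DD-Mon-YYYY HH:MM  SIZE
# and print "<package>-<version><TAB>YYYY-MM-DD" for each archive.
# Lines that do not match this shape are ignored.
listing_to_dates() {
  awk '
    BEGIN {
      split("Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec", names, " ")
      for (i = 1; i <= 12; i++) month[names[i]] = sprintf("%02d", i)
    }
    $1 ~ /\.tar\.gz$/ && split($2, d, "-") == 3 && (d[2] in month) {
      name = $1
      sub(/\.tar\.gz$/, "", name)
      printf "%s\t%s-%s-%s\n", name, d[3], month[d[2]], d[1]
    }
  '
}
```

Usage would be listing_to_dates < pasted_listing.txt, where pasted_listing.txt is a hypothetical file holding the copied page contents.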

Here are some commands you could use to set release_date for the Linux systems without going through the process above

 

Note: If you are using the CodeMRI web server, run the ‘export_web_data’ job after setting all the release_date fields properly to update the views