nimble-code / cobra

An interactive (fast) static source code analyzer

Shell 21.00% C 64.82% Yacc 13.67% Makefile 0.51%

cobra's Issues

False positive: "Inconsistent checks of function return values"

In C++ files, using extern "C" { causes some rules to fire when they shouldn't, such as "Inconsistent checks of function return values" in the Basic rule set. The rule seems to interpret the function definition as an invocation, probably because of the opening brace of the extern "C" block.

To reproduce:

$ cobra -C++ -f basic -comments t1.c

Input file:

#ifdef __cplusplus
extern "C"
{
#endif

bool
rcl_node_is_valid(const rcl_node_t * node)
{
  bool result = rcl_node_is_valid_except_context(node);
  if (!result) {
    return result;
  }
  if (!rcl_context_is_valid(node->context)) {
    RCL_SET_ERROR_MSG("rcl node's context is invalid");
    return false;
  }
  return true;
}

rcl_node_get_options(const rcl_node_t * node)
{
  if (!rcl_node_is_valid(node)) {
    return NULL;  // error already set
  }
  return &node->impl->options;
}

#ifdef __cplusplus
}
#endif

Results in:

=== Inconsistent checks of function return values:
 1: rcl_node_is_valid checked 1 times out of 2

setting `$C_BASE` in `.profile` gives a weird error

I am trying to set the rules folder through $C_BASE, but I get a weird problem.

Any idea whether this relates to the program itself?

The problem is as follows:

Download Cobra and set the path, along with these two variables, in the .profile file (it is an AWS machine):

echo COBRA='~/workspace/Cobra' >> ~/.profile
echo C_BASE='$COBRA/rules' >> ~/.profile
exit

/home/ubuntu/workspace/Cobra/rules

It seems to be set correctly, but the following error says otherwise. Go to the Cobra folder and issue the command cobra -terse -comments -f basic src/cobra_links.c.

error: cannot open ~/.cobra : check tool installation
cobra: cannot find 'basic'

Set it to exactly the same value again, but with export in the console.

export C_BASE=$COBRA/rules
echo $C_BASE
/home/ubuntu/workspace/Cobra/rules

$C_BASE seemingly has exactly the same value, yet it works now.

cobra -terse -comments -f basic src/cobra_links.c
=== Fct names also used as variables: 4
=== Modifying the control variable of a for loop inside the loop body: 1
=== Missing else at end of if-else-if chain: 4
=== Missing default in switch statement: 3
=== Functions defined more than once: 0

Infinite loop in CWE rule definition

Running the command cobra -comments -f cwe <filename> where <filename> has the contents:

#include <stdlib.h>

void func()
{
    void *a=malloc(1);
    if(NULL != a) {
        memset(a, 0, 1);
        free(a);
    }
}

results in an infinite loop. The problematic rule seems to be cwe_416.cobra. This applies to revision ab06e67.

Null pointer dereference upon `cobra -pat`

Running cobra -pat (without any further arguments) results in a null pointer dereference. Probably just due to missing command-line option validation.

Cobra/src/cobra_prep.c

Lines 612 to 614 in 5c9525b

    
char *
pattern(char *p)
{	char *n = (char *) emalloc(2*strlen(p)+1);

$ cobra -pat
Segmentation fault: 11

With AddressSanitizer compile-time instrumentation enabled:

$ cobra -pat
ASAN:DEADLYSIGNAL
=================================================================
==38178==ERROR: AddressSanitizer: SEGV on unknown address 0x000000000000 (pc 0x7fffb3941b52 bp 0x7fff57c9be00 sp 0x7fff57c9be00 T0)
==38178==The signal is caused by a READ memory access.
==38178==Hint: address points to the zero page.
    #0 0x7fffb3941b51 in strlen (libsystem_c.dylib:x86_64+0x1b51)
    #1 0x108193817 in wrap_strlen (libclang_rt.asan_osx_dynamic.dylib:x86_64h+0x15817)
    #2 0x107f76f68 in pattern cobra_prep.c:614
    #3 0x107f78dfd in main cobra_prep.c:942
    #4 0x7fffb390b234 in start (libdyld.dylib:x86_64+0x5234)
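
The trace suggests that -pat read a following argument that was not there. As a minimal sketch of the kind of check that would avoid this (option_value is a hypothetical helper written for illustration, not a function from Cobra's source):

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper, not Cobra's actual code: return the value that
 * follows the option at argv[i], or NULL (after printing a hint on
 * stderr) when the option is the last argument on the command line. */
const char *
option_value(int argc, char *argv[], int i)
{
	if (i + 1 >= argc || argv[i + 1] == NULL)
	{	fprintf(stderr, "error: %s requires an argument\n", argv[i]);
		return NULL;
	}
	return argv[i + 1];
}
```

A caller would test the result for NULL before passing it on to something like pattern(), so that strlen() is never applied to a null pointer.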

False positive: "Do not use dynamic memory allocation after task initialization"

Using either the jpl or p10 rule sets, the following code erroneously triggers the "Do not use dynamic memory allocation after task initialization" warning.

To reproduce:

cobra -C++ -comments -f jpl t1.c
cobra -C++ -comments -f p10 t1.c

Where t1.c is:

bool
rcutils_allocator_is_valid(const rcutils_allocator_t * allocator)
{
  if (
    NULL == allocator ||
    NULL == allocator->allocate ||
    NULL == allocator->deallocate ||
    NULL == allocator->zero_allocate ||
    NULL == allocator->reallocate)
  {
    return false;
  }
  return true;
}

The resulting output is:

=== R5: Do not use dynamic memory allocation after task initialization: 9
t1.c:6:
  1:      6      NULL == allocator ||
t1.c:7:
  2:      7      NULL == allocator->allocate ||
t1.c:8:
  3:      8      NULL == allocator->deallocate ||
t1.c:9:
  4:      9      NULL == allocator->zero_allocate ||
t1.c:10:
  5:     10      NULL == allocator->reallocate)
	globals used in one scope only:   0
	globals used in one file  only:   0
=== R16: Nr of statements: 2
=== R16: Nr of assertions: 0
=== R16: the minimum number of assertions is 2% = 0
1 errors

(Potential) False positive: "caller does not check return value"

Using the jpl ruleset, this rule indicates that the code does not check the return value. However, the calls return a duration, and the results are then used in a boolean expression. I'm not sure what checking could happen on the return values.

Also, the rule complains about this:

  return rmw_time_total_nsec(left) == rmw_time_total_nsec(right);

but not this:

  rmw_duration_t d1 = rmw_time_total_nsec(left);
  rmw_duration_t d2 = rmw_time_total_nsec(right);
  return d1 == d2;

These two forms are semantically equivalent.

To reproduce:

  cobra -C++ -comments -f jpl t.c

Where t.c contains:

#include "rmw/time.h"
#include "rcutils/time.h"

RMW_PUBLIC
RMW_WARN_UNUSED
bool
rmw_time_equal(const rmw_time_t left, const rmw_time_t right)
{
  return rmw_time_total_nsec(left) == rmw_time_total_nsec(right);
}

RMW_PUBLIC
RMW_WARN_UNUSED
rmw_duration_t
rmw_time_total_nsec(const rmw_time_t time)
{
  static const uint64_t max_sec = INT64_MAX / RCUTILS_S_TO_NS(1);
  if (time.sec > max_sec) {
    // Seconds not representable in nanoseconds
    return INT64_MAX;
  }

  const int64_t sec_as_nsec = RCUTILS_S_TO_NS(time.sec);
  if (time.nsec > (uint64_t)(INT64_MAX - sec_as_nsec)) {
    // overflow
    return INT64_MAX;
  }
  return sec_as_nsec + time.nsec;
}

Results in:

=== R13: declare data objects at smallest possible level of scope
	globals used in one scope only:   0
	globals used in one file  only:   4
	bool	used in only file /root/src/spaceros_ws/src/rmw/rmw/src/t.c
	RMW_PUBLIC	used in only file /root/src/spaceros_ws/src/rmw/rmw/src/t.c
	RMW_WARN_UNUSED	used in only file /root/src/spaceros_ws/src/rmw/rmw/src/t.c
	rmw_duration_t	used in only file /root/src/spaceros_ws/src/rmw/rmw/src/t.c
=== R14f: caller does not check return value: 1
t.c:9:
  1:      9    return rmw_time_total_nsec(left) == rmw_time_total_nsec(right);
=== R14g: caller does not check return value: 1
t.c:9:
  1:      9    return rmw_time_total_nsec(left) == rmw_time_total_nsec(right);
=== R16: Nr of statements: 6
=== R16: Nr of assertions: 0
=== R16: the minimum number of assertions is 2% = 0
1 errors

Invalid match

Call:

# cobra -recursive "x.c" -json -pe "^struct . { .* @type x:@ident ^:x* }"

Code:

typedef struct lprint_dymo_s            // DYMO driver data
{
  unsigned      ystart,                 // First line
                yend;                   // Last line
  int           feed;                   // Accumulated feed
} lprint_dymo_t;

Result:

[
  { "type"      :       "^struct . { .* @type x:@ident ^:x* }",
    "message"   :       "lines 1..6",
    "file"      :       "./x.c",
    "line"      :       1
  }
]

Real world occurrence: michaelrsweet/lprint#35

Some toying shows that the comment ruins the match. If we delete the comment on the typedef struct ... line then the false match goes away.

Iridex Ruleset Question

Hello,

Can you tell me more about the iridex ruleset? Is it derived from a standard?

Thanks!

Cobra hangs on a simple test file

Setup

Create a file named test_code.cpp (or something similar) with the following contents:

// Expect A16_0_1
#ifdef SOME_VAR
#define SOME_OTHER_VAR 5
#endif

// Expect another A16_1_1
#ifdef SOMETHING
#define SOMETHING_ELSE 10
#endif

int main(int argc, char *argv[]) {
    // Expect A18_1_1
    int my_arr[1024];

    // Expect another A18_1_1
    char some_buf[256];
}

This is obviously nonsense code, but it reproduces these issues for me.

Execution

Run /path/to/cobra -C++ -comments -json -f C++/autosar /path/to/test_code.cpp, replacing /path/to with the real absolute paths to the cobra binary and test_code.cpp files, respectively.

Expected result

Cobra outputs some issues to stdout, and reports them in various output files.

Actual result

The execution of the cobra binary hangs indefinitely.

Consistent user control of C preprocessing

Currently, when using the MISRA 2012 ruleset, Cobra invokes the C preprocessor, even without the -cpp option being specified. To make it more consistent, and easier to integrate with the Space ROS CI system, I recommend only running the preprocessor if the user requests -cpp.

How to match any expression in two locations?

Motivated by a use of memset() ; free() in linux-pam, I made a cobra pattern to detect these instances (which should use memset_s):

    cobra --cpp -pat '{ .* memset ( x:@ident , .* , .* ) .* free ( .*:x ) .* }'

Sadly this does not work. The x identifier isn't usually an x at all but something like a->b->x or a->b->c->x etc. I've worked around the issue by allowing more false positives:

cobra --cpp -pat '{ .* memset ( .*x:@ident , .* , .* ) .* free ( .*:x ) .* }'

That is, we now will complain about memset(a->x, 0, N); free(b->x);. Is there a more general match I should be using than ident which makes something akin to the first attempt work?

Improvements requested for the json_convert SARIF output

Besides the fixes in this PR: #48

There are a few items that would make the SARIF output more useful in an IDE:

1. The rules/fullDescription/text appears to be truncated. For example, text descriptions are strings like "Missing" and "Inconsistent". It seems that the full description string has been cut down to only its first word.
2. According to the standard, the rules/messageStrings/default/text item should be a complete sentence ending in a period. It is currently a string like "lines 238..238". This should be a more complete sentence describing the rule.
3. There is no "artifacts" entry.

Here is a screen capture from the Visual Studio Code SARIF plug-in that demonstrates the first two items:

`json` command line option question

Hello,

How are the json and json+ command line options used? I have tried the following, but no JSON file is saved.

cobra -json -f p10 test.c

Broken Pattern?

I think I found a broken pattern but it could easily be my lack of understanding of the pattern language.

Consider:

void func3()
{
    void *x;
    curl_easy_getinfo(x);
    x = curl_easy_init();
}

It seems we should be able to match the pattern f(x) ; x = g(), or the more generalized f(x ...) ; ... ; x = g(), using the pattern:

cobra -pat '{ .*  curl_easy_getinfo ( .* x:@ident .* ) ^:x* :x = curl_easy_init ( ) .* }' b.c

But this yields no output. It seems that generalization from f(x) to f(x,y) or f(x,y,z) really threw us for a loop because this does work:

cobra -pat '{ .*  curl_easy_getinfo ( .* x:@ident ) ^:x* :x = curl_easy_init ( ) .* }'

Notice we removed .* after x:@ident.

This is certainly contrary to the English specification on the website (".* matches zero or more"): there are zero tokens to match, and adding .* should never make a match that succeeded suddenly fail. Is it a bug? My misunderstanding? Is there another way to approach the pattern?

Segmentation fault in 3.5 but not in 3.1

Hello,

I am seeing a Segmentation fault (core dumped) error in Cobra 3.5 that isn't present in 3.1.

This only occurs for two rulesets -- MISRA 1997 and MISRA 2004.

misra1997

:m97_rule82
eval: '(.txt != } || .curly > 1)'
:m97_rule83
Segmentation fault (core dumped)

misra2004

:m04_rule16.5
:m04_rule16.8
Segmentation fault (core dumped)

The other included rulesets run without issue.

Can't link Cobra - multiple symbol definitions of tokrange and t_id

[root@perft src]# make install_linux
cc -I. -Wall -O2 -pedantic -Werror -Wshadow -std=c99 -DYY_NO_INPUT -c -o cobra_prim.o cobra_prim.c
cc -I. -Wall -O2 -pedantic -Werror -Wshadow -std=c99 -DYY_NO_INPUT -c -o cobra_prep.o cobra_prep.c
cc -I. -Wall -O2 -pedantic -Werror -Wshadow -std=c99 -DYY_NO_INPUT -c -o cobra_heap.o cobra_heap.c
cc -I. -Wall -O2 -pedantic -Werror -Wshadow -std=c99 -DYY_NO_INPUT -c -o cobra_lib.o cobra_lib.c
yacc -o cobra_eval.c cobra_eval.y
cc -I. -Wall -O2 -pedantic -Werror -Wshadow -std=c99 -DYY_NO_INPUT -c -o cobra_eval.o cobra_eval.c
cc -I. -Wall -O2 -pedantic -Werror -Wshadow -std=c99 -DYY_NO_INPUT -c -o cobra_fcg.o cobra_fcg.c
cc -I. -Wall -O2 -pedantic -Werror -Wshadow -std=c99 -DYY_NO_INPUT -c -o cobra_lex.o cobra_lex.c
yacc -d -p xx -o cobra_prog.c cobra_prog.y
yacc: 8 shift/reduce conflicts.
cc -I. -Wall -O2 -pedantic -Werror -Wshadow -std=c99 -DYY_NO_INPUT -c -o cobra_prog.o cobra_prog.c
cc -I. -Wall -O2 -pedantic -Werror -Wshadow -std=c99 -DYY_NO_INPUT -c -o cobra_sym.o cobra_sym.c
cc -I. -Wall -O2 -pedantic -Werror -Wshadow -std=c99 -DYY_NO_INPUT -c -o cobra_cfg.o cobra_cfg.c
cc -I. -Wall -O2 -pedantic -Werror -Wshadow -std=c99 -DYY_NO_INPUT -c -o cobra_te.o cobra_te.c
cc -I. -Wall -O2 -pedantic -Werror -Wshadow -std=c99 -DYY_NO_INPUT -c -o cobra_links.o cobra_links.c
cc -I. -Wall -O2 -pedantic -Werror -Wshadow -std=c99 -DYY_NO_INPUT -c -o cobra_array.o cobra_array.c
cc -I. -Wall -O2 -pedantic -Werror -Wshadow -std=c99 -DYY_NO_INPUT -c -o cobra_list.o cobra_list.c
cc -I. -Wall -O2 -pedantic -Werror -Wshadow -std=c99 -DYY_NO_INPUT -c -o cobra_json.o cobra_json.c
cc -I. -Wall -O2 -pedantic -Werror -Wshadow -std=c99 -DYY_NO_INPUT -o cobra cobra_prim.o cobra_prep.o cobra_heap.o cobra_lib.o cobra_eval.o cobra_fcg.o cobra_lex.o cobra_prog.o cobra_sym.o cobra_cfg.o cobra_te.o cobra_links.o cobra_array.o cobra_list.o cobra_json.o -pthread
ar -r c.ar cobra_lex.o cobra_prep.o cobra_prim.o cobra_heap.o cobra_links.o cobra_json.o
ar: creating c.ar
cp c.ar ../src_app
cp ../doc/cobra.1 /usr/local/share/man/man1
cp ../doc/cwe.1 /usr/local/share/man/man1
cp ../doc/find_taint.1 /usr/local/share/man/man1
cp -f cobra ../bin_linux
cp -f ../gui/* ../bin_linux
cd ../src_app; make clean install_linux
make[1]: Entering directory '/opt/ncode/Cobra/src_app'
rm -f *.exe *.o cwe abstract scope_check binop cfg deref fct_param_counts flatten float ident_check ident_length ifelseif igrep lf misra2004 nomacros nr_cases rule23_rule31 switch_default find_taint
cc -Wall -pedantic -Werror -Wshadow -O2 -DYY_NO_INPUT -I. -std=c99 -c -o cwe.o cwe.c
cc -Wall -pedantic -Werror -Wshadow -O2 -DYY_NO_INPUT -I. -std=c99 -c -o cwe_util.o cwe_util.c
cc -Wall -pedantic -Werror -Wshadow -O2 -DYY_NO_INPUT -I. -std=c99 -c -o cwe_119.o cwe_119.c
cc -Wall -pedantic -Werror -Wshadow -O2 -DYY_NO_INPUT -I. -std=c99 -c -o cwe_120.o cwe_120.c
cc -Wall -pedantic -Werror -Wshadow -O2 -DYY_NO_INPUT -I. -std=c99 -c -o cwe_131.o cwe_131.c
cc -Wall -pedantic -Werror -Wshadow -O2 -DYY_NO_INPUT -I. -std=c99 -c -o cwe_134.o cwe_134.c
cc -Wall -pedantic -Werror -Wshadow -O2 -DYY_NO_INPUT -I. -std=c99 -c -o cwe_170.o cwe_170.c
cc -Wall -pedantic -Werror -Wshadow -O2 -DYY_NO_INPUT -I. -std=c99 -c -o cwe_197.o cwe_197.c
cc -Wall -pedantic -Werror -Wshadow -O2 -DYY_NO_INPUT -I. -std=c99 -c -o cwe_468.o cwe_468.c
cc -Wall -pedantic -Werror -Wshadow -O2 -DYY_NO_INPUT -I. -std=c99 -c -o cwe_805.o cwe_805.c
cc -Wall -pedantic -Werror -Wshadow -O2 -DYY_NO_INPUT -I. -std=c99 -c -o cwe_416.o cwe_416.c
cc -Wall -pedantic -Werror -Wshadow -O2 -DYY_NO_INPUT -I. -std=c99 -c -o cwe_457.o cwe_457.c
cc -Wall -pedantic -Werror -Wshadow -O2 -DYY_NO_INPUT -I. -std=c99 -o cwe cwe.o cwe_util.o cwe_119.o cwe_120.o cwe_131.o cwe_134.o cwe_170.o cwe_197.o cwe_468.o cwe_805.o cwe_416.o cwe_457.o c.ar -pthread
/opt/rh/gcc-toolset-10/root/usr/bin/ld: c.ar(cobra_prep.o):(.bss+0x20): multiple definition of `t_id'; cwe_util.o:(.bss+0x0): first defined here
/opt/rh/gcc-toolset-10/root/usr/bin/ld: c.ar(cobra_prep.o):(.bss+0x0): multiple definition of `tokrange'; cwe_util.o:(.bss+0x8): first defined here
collect2: error: ld returned 1 exit status
make[1]: *** [makefile:45: cwe] Error 1
make[1]: Leaving directory '/opt/ncode/Cobra/src_app'
make: *** [makefile:64: install_linux] Error 2
[root@perft src]#

Compiler version:
[root@perft src]# gcc -v
Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/opt/rh/gcc-toolset-10/root/usr/libexec/gcc/x86_64-redhat-linux/10/lto-wrapper
Target: x86_64-redhat-linux
Configured with: ../configure --enable-bootstrap --enable-languages=c,c++,fortran,lto --prefix=/opt/rh/gcc-toolset-10/root/usr --mandir=/opt/rh/gcc-toolset-10/root/usr/share/man --infodir=/opt/rh/gcc-toolset-10/root/usr/share/info --with-bugurl=http://bugzilla.redhat.com/bugzilla --enable-shared --enable-threads=posix --enable-checking=release --enable-multilib --with-system-zlib --enable-__cxa_atexit --disable-libunwind-exceptions --enable-gnu-unique-object --enable-linker-build-id --with-gcc-major-version-only --with-linker-hash-style=gnu --enable-plugin --enable-initfini-array --with-isl=/builddir/build/BUILD/gcc-10.3.1-20210422/obj-x86_64-redhat-linux/isl-install --disable-libmpx --enable-gnu-indirect-function --enable-cet --with-tune=generic --with-arch_32=x86-64 --build=x86_64-redhat-linux
Thread model: posix
Supported LTO compression algorithms: zlib zstd
gcc version 10.3.1 20210422 (Red Hat 10.3.1-1) (GCC)
[root@perft src]#

OS Version
[root@perft src]# cat /proc/version
Linux version 4.18.0-348.7.1.el8_5.x86_64 ([email protected]) (gcc version 8.5.0 20210514 (Red Hat 8.5.0-3) (GCC)) #1 SMP Tue Dec 21 19:02:23 UTC 2021
[root@perft src]#

I changed the makefile to put the man pages in /usr/local/share/man, but made no other changes to the makefiles or source.

Both issues look like clashes between src/cobra_prep.c and src_app/cwe_util.c and src_app/scope_check.c.
I changed the definitions at the top of src_app/cwe_util.c to externs, as well as the tokrange definition in src_app/scope_check.c, and now everything links.
Please let me know if there's something more I can do to help.

Minor typo on website

http://spinroot.com/cobra/downloads.html

Has a typo: "Cobra is be available starting on Github at github.com/nimble-code/Cobra."

Better JSON output

In the JSON output produced with the -json flag, there are a couple of issues:

Remove trailing commas on the "line" field:

 [
  { "type"	:	"/define @ident \( .* x:@ident .* \) [. (]* ^\( :x",
    "message"	:	"lines 20..22",
    "file"	:	"dynamic/pam.c",
    "line"	:	20,
  },
  { "type"	:	"/define @ident \( .* x:@ident .* \) [. (]* ^\( :x",
    "message"	:	"lines 19..19",
    "file"	:	"libpam/pam_env.c",
    "line"	:	19,
  }
]

Don't escape other characters in the "type" field, as that produces invalid JSON. The only characters that should be escaped are double quotes and backslashes.
When there is no match for a pattern, I would expect the empty array []; instead I get:

[
  { "type"	:	"/restore \( .* \) { .* }",
    "message"	:	"no matches found",
    "file"	:	"stdin",
    "line"	:	0,
  }
]
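
To make the escaping request concrete, here is a minimal sketch of a JSON string escaper. The name json_escape and its signature are assumptions made for illustration, not part of Cobra. Besides double quotes and backslashes, it also escapes control characters below 0x20, which the JSON grammar requires inside strings.

```c
#include <stddef.h>
#include <stdio.h>

/* Sketch only: copy src into dst (capacity n bytes) as the body of a
 * JSON string. Double quotes and backslashes get a backslash prefix;
 * control characters below 0x20 are written as \u00XX.
 * Returns 0 on success, -1 if dst is too small. */
int
json_escape(const char *src, char *dst, size_t n)
{	size_t used = 0;

	if (n == 0) return -1;
	for (; *src != '\0'; src++)
	{	unsigned char c = (unsigned char) *src;

		if (c == '"' || c == '\\')
		{	if (used + 2 >= n) return -1;
			dst[used++] = '\\';
			dst[used++] = (char) c;
		} else if (c < 0x20)
		{	if (used + 6 >= n) return -1;
			/* writes 6 characters plus a terminator */
			snprintf(dst + used, 7, "\\u%04x", (unsigned) c);
			used += 6;
		} else
		{	if (used + 1 >= n) return -1;
			dst[used++] = (char) c;
		}
	}
	dst[used] = '\0';
	return 0;
}
```

With this approach, a pattern containing \( is emitted as \\( in the "type" field, which is a valid JSON escape of the backslash, instead of the invalid sequence \(.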

Support the JUnit XML format for Cobra output

In order to integrate with the Jenkins CI/CD tool, it would be helpful to have Cobra output detected issues in the JUnit XML format, which is understood by Jenkins.

Infer finds some bugs in Cobra

I ran Muse (i.e. infer, shellcheck) on Cobra, results here.

The main findings are:

efree not actually freeing memory - seems intentional, but certainly means there are leaks.
Some dead stores - most are uninteresting (int ret=0 ; ... ; ret = some_call(); ... ; return ret). However one so far does appear more like junk code that is dead due to refactor or perhaps missing logic (here https://github.com/TomMD/Cobra/blob/ce88343992eb7f16c4b67ca108033b037d0f6354/src/cobra_lib.c#L2202).

Multiple input directories CLI

Is it possible to provide multiple input directories to Cobra via the command line?

As an example, I'd like to be able to do something like this:
cobra -v -f basic inc/*.[h] src/*.[c]

This would allow me to maintain the same directory structure as my project and run Cobra more easily.

It's possible I just don't know the correct syntax and Cobra will already support this.

Thanks.

about python comment lines

Although the -Python flag gives Cobra the ability to recognize Python keywords, it does not tokenize comments.

I could write a simple (naive) fix for single-line comments like this into cobra_lex.c:981:

		case '#':
			if(python)
			{
				p_comment("#",cid);
				continue;
			}

But multi-line comments, delimited by triple quotes """, require looking two characters ahead of the quote ", the string marker. I could not work out how to implement that; I guess we need a new complementary next2char(int cid) for this purpose.

Can you check the code above and comment on both cases?
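
To illustrate the two-character lookahead, here is a rough sketch that works on an in-memory buffer rather than on Cobra's character stream. The function skip_triple_quoted is hypothetical and only shows the idea; it would need adapting to the lexer's own input routines.

```c
#include <stddef.h>

/* Sketch only: if s starts with three copies of the same quote
 * character (""" or '''), return a pointer just past the matching
 * closing triple quote. Return NULL if s does not open a
 * triple-quoted string or if the closing delimiter is missing. */
const char *
skip_triple_quoted(const char *s)
{	char q = s[0];
	const char *p;

	if ((q != '"' && q != '\'') || s[1] != q || s[2] != q)
		return NULL;		/* two-character lookahead failed */

	for (p = s + 3; *p != '\0'; p++)
	{	if (*p == '\\' && p[1] != '\0')
		{	p++;		/* skip the escaped character */
			continue;
		}
		if (p[0] == q && p[1] == q && p[2] == q)
			return p + 3;
	}
	return NULL;			/* unterminated */
}
```

The short-circuit evaluation of && keeps the lookahead from reading past the end of the buffer, which is the same concern a stream-based next2char would have to handle at end of file.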

problem with spaces in file/folder names while reading from a file list with "-F"

Another problem with spaces. I tried to find the problem and fix it myself, but it seems I am still not good enough at C.

File names are read correctly from the file list (in prep_args), but only the last space-free part of a file's name makes it deeper into the program (so a path without any spaces is read correctly). For example, spaced folder name/spaced file name.c is shortened to name.c, which gives a cannot open file 'name.c' error.

I saw that there is a statement in cobra_prep.c:

// but the filename may contain spaces
// provided they are escaped with a backslash

I then tried to prefix the spaces with backslashes: spaced\ folder\ name/spaced\ file\ name.c. But with this change the program never ends, either because of an infinite loop or because something gets stuck. To confirm the "never ends" behavior, I used a single file with a single character in it, which is processed within a second if the file is read correctly.

YILMAZ

Warnings don't go to stderr

I get a warning for a pattern expression that goes to stdout, which causes issues with parsing:

$ cobra -json -pe "{ .* memset ( ^)* x:@ident , ^)* , ^)* ) ^:x* free ( ^):x ) .* }" a.c
warning: is a space missing before : in '):x'?

This specific warning is also a false positive.

Pattern match slightly complicated malloc free example

Hi,
First of all, thanks for sharing this wonderful project with the community, and also for the continuous improvements and the rules library.
I am using Cobra to find memory leaks in our product. I was able to catch a few leaks using a simple query.

find . -name "*.c" | xargs cobra -pat '{ .* cmsMem_alloc ^cmsMem_free* }'

While this works for basic cases where alloc() and free() are in the same scope, I have other scenarios. Could you please suggest a pattern/command that would work for them?

void fun()
{
    if((x = cmsMem_alloc()) == NULL) {
        return;
    }

    if(...) {
        .....
        cmsMem_free(x);
    }
    else {
        ....
        // this is a macro which internally calls
        //  cmsMem_free() and also assigns NULL to x.
        CMSMEM_FREE_BUF_AND_NULL_PTR(x);
    }
    return;
}

It is possible some functions use either of the free implementations or both. So,

I want to match the malloc and also check if either variant of free exists after it.
It should search this pattern within a function scope and not block scope.
Also, if I could additionally match with identifier x, it would be great. This also reduces false positives, in case there are multiple alloc, free statements within a function.

Do you think this pattern is correct without identifier?
find . -name "*.c" | xargs cobra -pat '{ .* cmsMem_alloc ^(cmsMem_free|CMSMEM_FREE_BUF_AND_NULL_PTR)* }'

Can we add ident as well here?
Kindly share if this is feasible?

in `cobra_prim.c`, assertion failures occur, `$ARGS` is empty and `$FLAGS` might be used in wrong order, version 3.8

Either something elsewhere in the code changed the behavior of check_args with update 3.8, or these were silent bugs awaiting this moment.

I ran the program on a dummy empty file with a few flags: cobra -cpp -terse nospace.c.

If I try only !$ARGS, the following assertion error comes up.

Assertion failed: strlen(c)+strlen(p) < n (cobra_prim.c: check_args: 213)

I added a printf statement before that line and also changed the condition to ... < n+1 to see a result. I found that the assertion fails because $ARGS is now empty; it passes only when there are more characters after $ARGS.

: !$ARGS                 // assertion fails
c:0 p:0 n:0

: !echo $ARGS         //assertion fails
c:0 p:5 n:5

: !echo $ARGS.       // assertion passes if there is any extra character after, includes a single space
c:0 p:5 n:6
.

As for $FLAGS, I found this by accident while trying the following command. At first I thought $ARGS was filled in wrong, but since it is already empty, it must be $FLAGS being substituted in the wrong position.

: !echo $FLAGS
-cpp -terse

: !echo C: $COBRA , A: $ARGS and F: $FLAGS.
c:37 p:6 n:72
C: /workspace/Cobra/rules/../bin , A: -cpp -terse and F:

Lastly, if I use $FLAGS before $ARGS (which is the normal order), the command is cut short after $FLAGS.

: !echo C: $COBRA , F: $FLAGS , A: $ARGS
c:37 p:4 n:65
C: /workspace/Cobra/rules/../bin , F: -cpp -terse

I hope the solution will be easy and fast, because the rulesets that use scope_check and similar shell spawns are now compromised by this empty $ARGS.

PS: I was trying to see your changes for filenames containing spaces, but this problem happens with no spaces too.
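
For reference, here is a sketch of variable substitution that sizes its buffer exactly, so an empty value (as with $ARGS here) cannot trip a length check. It is not how check_args is written; expand_var is a hypothetical function, and the naive strstr matching would also match a longer name that merely starts with the given one.

```c
#include <stdlib.h>
#include <string.h>

/* Sketch only: return a newly allocated copy of cmd in which every
 * occurrence of name (for example "$ARGS") is replaced by value.
 * The caller frees the result. Returns NULL on allocation failure
 * or if name is empty. */
char *
expand_var(const char *cmd, const char *name, const char *value)
{	size_t nlen = strlen(name), vlen = strlen(value);
	size_t count = 0, size;
	const char *p;
	char *out, *w;

	if (nlen == 0) return NULL;
	for (p = cmd; (p = strstr(p, name)) != NULL; p += nlen)
		count++;

	/* exact size: original text, minus names, plus values, plus NUL */
	size = strlen(cmd) - count * nlen + count * vlen + 1;
	out = malloc(size);
	if (out == NULL) return NULL;

	w = out;
	while ((p = strstr(cmd, name)) != NULL)
	{	memcpy(w, cmd, (size_t) (p - cmd));	/* text before the name */
		w += p - cmd;
		memcpy(w, value, vlen);			/* the replacement */
		w += vlen;
		cmd = p + nlen;
	}
	strcpy(w, cmd);					/* remaining tail */
	return out;
}
```

Applying it once per variable keeps the text after each substitution intact, which is the part that gets cut off in the $FLAGS example above.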

bad field type 'fct'

Hello,

I'm getting "bad field type 'fct'" when trying to run the basic ruleset. Any ideas?

`cobra_prep.c`: `-cpp` flag and filenames with spaces (have solution, need a check)

While checking another problem (which I reported moments ago), I stumbled upon yet another one. Hopefully I have a solution for this one, but it would be better if you checked it.

When the -cpp flag is used, a command is apparently spawned from cobra_prep.c to run CPP (set to use gcc). If a filename includes spaces, the command processes it as if multiple file names were given.

In cobra_prep.c, line 230, wrapping the filename placeholder in quotes invokes the command with the correct filename.

snprintf(buf, n, "%s %s -w -E -x %s \"%s\" > %s",
			    CPP, preproc, lang, f, fnm);

The reason I need you to check is that I am not sure whether you spawn this command for a single file or for multiple files. If it only ever receives a single file name, this fix should be fine.
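
One caveat with plain double quotes is that a filename containing ", $ or a backquote would still be interpreted by the shell. A more defensive variant, sketched below under the assumption that the command is run by a POSIX shell, wraps the name in single quotes. The function shell_quote is hypothetical and not part of Cobra.

```c
#include <stddef.h>

/* Sketch only: wrap src in single quotes for a POSIX shell, writing
 * the result to dst (capacity n bytes). An embedded ' becomes '\''
 * (close quote, escaped quote, reopen quote).
 * Returns 0 on success, -1 if dst is too small. */
int
shell_quote(const char *src, char *dst, size_t n)
{	size_t used = 0;

	if (n < 3) return -1;		/* room for '' and the terminator */
	dst[used++] = '\'';
	for (; *src != '\0'; src++)
	{	if (*src == '\'')
		{	/* 4 characters now, closing quote and NUL later */
			if (used + 4 + 2 > n) return -1;
			dst[used++] = '\'';
			dst[used++] = '\\';
			dst[used++] = '\'';
			dst[used++] = '\'';
		} else
		{	if (used + 1 + 2 > n) return -1;
			dst[used++] = *src;
		}
	}
	dst[used++] = '\'';
	dst[used] = '\0';
	return 0;
}
```

The quoted result could then be passed to the existing snprintf call through a plain %s placeholder in place of the \"%s\" form.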

Allow for recursive file patterns

On a repository with a nested file structure, common in Java projects, there is no easy way to run Cobra across all of these files.

I suggest adding an option such as cobra --recursive "*.java", or support for recursive file globbing such as **/*.java.

Output file not valid JSON

I believe the intention is for the output file of Cobra to be a valid JSON file (although it currently has a .txt extension).

To demonstrate that Cobra sometimes does not produce valid JSON:

Create two files, t1.c and t2.c with the following contents:

#include <stdio.h>

int main()
{
  int x;
  int y;

  printf("%d:%d\n", x, y);
  return 0;
}

Then run cobra on these files with:

$ cobra -C++ -comments -json -f p10 t1.c t2.c

The contents of the output file (_P10_.txt) will be:

[
  { "type"      :       "Rule 5: use minimally two assertions per function on average",
    "message"   :       "lines 3..10",
    "file"      :       "t2.c",
    "line"      :       3,
    "cobra"     :       "1 2 22"
  }
  { "type"      :       "Rule 5: use minimally two assertions per function on average",
    "message"   :       "lines 3..10",
    "file"      :       "t1.c",
    "line"      :       3,
    "cobra"     :       "1 2 22"
  }
]

Note that there is no comma between the entries.
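
A common way to get the separators right is to emit the comma before every entry except the first. The sketch below shows that idea; the Finding type and write_findings function are hypothetical stand-ins, since Cobra's internal representation of a finding is not shown in this report.

```c
#include <stdio.h>

/* Hypothetical record type, for illustration only. */
typedef struct {
	const char *type;	/* assumed to be JSON-escaped already */
	const char *file;
	int line;
} Finding;

/* Write all findings as one JSON array. The separator is emitted
 * before every entry except the first, so there is never a missing
 * or trailing comma, and zero findings yield an empty array. */
void
write_findings(FILE *fp, const Finding *f, int count)
{	int i;

	fprintf(fp, "[\n");
	for (i = 0; i < count; i++)
	{	fprintf(fp, "%s  { \"type\": \"%s\", \"file\": \"%s\", \"line\": %d }",
			i > 0 ? ",\n" : "", f[i].type, f[i].file, f[i].line);
	}
	fprintf(fp, "%s]\n", count > 0 ? "\n" : "");
}
```

Collecting the findings for all input files first and writing them in a single pass like this also avoids producing several top-level arrays in one output file.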

JSON output misses commas

When processing code and producing a JSON output file, the commas that should separate the objects representing individual findings are missing.

call:

cobra -f misra2012 -json+ -recursive '*.c'

result:

[
  { "type"	:	"(Required) The atof, atoi, atol and atoll fcts of <stdlib.h> shall not be used",
    "message"	:	"lines 268..268",
    "source"	:	"{	e->val = atoi(e->s);",
    "file"	:	"./cobra_array.c",
    "line"	:	268,
    "cobra"	:	"1 6 0"
  }
]
[
  { "type"	:	"(Required) The memory allocation and deallocation fcts of <stdlib.h> shall not be used",
    "message"	:	"lines 751..751",
    "source"	:	"a->ht = (Arr_el **) emalloc(H_SIZE * sizeof(Arr_el *), 9);",
    "file"	:	"./cobra_array.c",
    "line"	:	751,
    "cobra"	:	"1 10 0"
  }
  { "type"	:	"(Required) The memory allocation and deallocation fcts of <stdlib.h> shall not be used",
    "message"	:	"lines 748..748",
    "source"	:	"{	a = (Arr_var *) emalloc(sizeof(Arr_var), 8);",
    "file"	:	"./cobra_array.c",
    "line"	:	748,
    "cobra"	:	"1 8 0"
  }
  { "type"	:	"(Required) The memory allocation and deallocation fcts of <stdlib.h> shall not be used",
    "message"	:	"lines 735..735",
    "source"	:	"{	statstring[i] = (char *) emalloc(512 * sizeof(char), 7); ",
    "file"	:	"./cobra_array.c",
    "line"	:	735,
    "cobra"	:	"1 11 0"
  }
...

interactive session does not show any keystrokes for re-compiled version on alpine linux

Hi there,

I am trying Cobra on Alpine Linux in Docker. Installing gcc, musl-dev, byacc, and make is enough to compile it (or at least there are no compilation errors).

However, something is seemingly missing: the interactive session does not show anything I type (not even the colon prompts). If I type blindly without any typo, or copy-paste a command, it works without a problem. The cursor just sits at the left-most position and does not move for any keystroke until the Enter key is pressed.

Also, this problem does not occur in an Ubuntu container.

I tried to compare the source code and the libraries installed on Ubuntu and Alpine. Unfortunately, I failed to isolate the problem. I also tried to understand the source code and failed at that too.

It might be solved by installing a few additional libraries, or it might depend purely on how Alpine Linux is built. In the latter case, there is clearly nothing that can be done to solve the issue.

Is there anything you can think of?

PS: The reason I am trying to work on Alpine is its small Docker image size, since I work on Windows. By the way, the precompiled Linux binaries do not work on Alpine.

trying to use $COBRA and $ARGS in background shell causes failed assertions

Executing commands on the host while in Cobra is nice:

! c execute command(s) c in a background shell

But I found this issue while checking another one. So far I have been able to run many shell commands with ! and had no issue, except when trying to use $COBRA and $ARGS.

: !echo $ARGS
cobra: cobra_prim.c:235: check_args: Assertion `strlen(c)+strlen(f->s) < n' failed.
Aborted (core dumped)
----
: !echo $COBRA
cobra: cobra_prim.c:241: check_args: Assertion `strlen(c)+strlen(p) < n' failed.
Aborted (core dumped)

I have seen there is an old post from 2019. Since you marked it as solved and I could not relate that to these two, I opened this new issue.

My other issue, #34, might be irrelevant, but it might also be directly caused by this one, so you may want to check them together.

Thanks
YILMAZ

False positive: "Fct names also used as variables"

Using the Basic rule set, when a function name is provided to a macro, the rule incorrectly identifies this as using the function name as a variable. I've noticed that this is only for the first occurrence in the function (the second macro invocation does not provoke the warning).

To reproduce:

$ cobra -C++ -comments -f basic t1.c

Where t1.c is:

rcl_ret_t
rcl_client_init(
  rcl_client_t * client,
  const rcl_node_t * node,
  const rosidl_service_type_support_t * type_support,
  const char * service_name,
  const rcl_client_options_t * options)
{
  TRACEPOINT(
    rcl_client_init,
    (const void *)client,
    (const void *)node,
    (const void *)client->impl->rmw_handle,
    remapped_service_name);

  ANOTHER_MACRO(
    rcl_client_init,
    (const void *)client,
    (const void *)node,
    (const void *)client->impl->rmw_handle,
    remapped_service_name);

   return 0;
}

Results in:

=== Fct names also used as variables: 1
t1.c:11:
  1:     11      rcl_client_init,

spaces in filenames causes "scope_check" in rulesets to fail

I have some files with spaces in their names.

I was getting errors because the names are split before use and treated as if they were multiple files. I first thought my shell scripts were causing the issue, so instead of using wildcards I now collect all file names with find and process them in a loop.

But after fixing those issues and making sure the scripts are fine, I traced the problem through the rules and found this line, and similar lines in other rules, all with the same issue: the file's name is not read as a whole, so Cobra complains with multiple "cannot open file" errors.

scope_check -N1 -c rn $FLAGS $ARGS

If I run this scope_check in the console with quoted names, it works fine. But when it runs from a ruleset, the names are split as described.

After a bit of hacking into cobra_prim.c (commented out 235 and 241), I could see $ARGS holds the name of the file. I tried adding quotes around it as I did in shell scripts, then it errors out saying "Unterminated quoted string".

For a small number of files, or for files I own, renaming is the way to go. But that is not applicable to multi-user or remote projects.

Since such names are uncommon and not all rulesets use scope_check, this is not an urgent problem. But, of course, it would be nice to have it fixed.
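One way a name with spaces could be kept together when it is handed to a shell is to wrap it in single quotes first. The sketch below is only an illustration of that general technique: `shell_quote` is a hypothetical helper, not a function in Cobra, and whether Cobra's rule-file parser would accept the quoted result is an open question (the "Unterminated quoted string" error above suggests it may need changes as well).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not part of Cobra: wraps name in single quotes for a
 * POSIX shell, rewriting each embedded ' as '\'' so the name stays one word.
 * Returns 0 on success, -1 if out (of the given size) is too small. */
int shell_quote(const char *name, char *out, size_t size)
{
    size_t k = 0;

    assert(name != NULL && out != NULL);
    if (size < 3) {
        return -1;                      /* need room for '' plus the NUL */
    }
    out[k++] = '\'';
    for (const char *p = name; *p != '\0'; p++) {
        if (*p == '\'') {
            /* 4 bytes now, plus closing quote and NUL later */
            if (k + 4 + 2 > size) {
                return -1;
            }
            out[k++] = '\'';
            out[k++] = '\\';
            out[k++] = '\'';
            out[k++] = '\'';
        } else {
            /* 1 byte now, plus closing quote and NUL later */
            if (k + 1 + 2 > size) {
                return -1;
            }
            out[k++] = *p;
        }
    }
    out[k++] = '\'';
    out[k] = '\0';
    return 0;
}
```

With this, a name such as `my file.c` would be passed on as `'my file.c'`, which a POSIX shell treats as a single argument.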

is it possible to increase `MAXYYTEXT` if file size is big, programmatically?

I have bumped into this assertion issue where Cobra stops at the beginning:

cobra: cobra_lex.c:450: p_comment: Assertion `i < MAXYYTEXT' failed

The file is only 107356 bytes long, but it has 2140 lines and 7287 tokens in it.

I increased MAXYYTEXT to 3072 and recompiled Cobra to get it to run, so the problem can be fixed this way.

But I wonder if, instead of failing, it is possible to increase this value in steps dynamically if the file size is above some thresholds, or how hard would it be to do so.
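For illustration, growing a buffer in steps instead of asserting on a fixed limit usually looks like the sketch below. This is not Cobra's lexer code: `TextBuf` and its functions are hypothetical names, and the doubling policy is just one common choice.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical growable text buffer; not taken from Cobra's sources. */
typedef struct {
    char  *data;   /* NUL-terminated contents */
    size_t len;    /* bytes in use, excluding the NUL */
    size_t cap;    /* bytes allocated */
} TextBuf;

/* Allocates the initial buffer; returns 0 on success, -1 on failure. */
int textbuf_init(TextBuf *b, size_t initial_cap)
{
    assert(b != NULL && initial_cap > 0);
    b->data = malloc(initial_cap);
    if (b->data == NULL) {
        return -1;
    }
    b->data[0] = '\0';
    b->len = 0;
    b->cap = initial_cap;
    return 0;
}

/* Appends one character, doubling the capacity when the buffer is full.
 * Returns 0 on success, -1 if memory could not be obtained. */
int textbuf_putc(TextBuf *b, char c)
{
    if (b->len + 1 >= b->cap) {          /* need room for c and the NUL */
        size_t new_cap = b->cap * 2;
        char *p = realloc(b->data, new_cap);
        if (p == NULL) {
            return -1;                   /* the old buffer is still valid */
        }
        b->data = p;
        b->cap = new_cap;
    }
    b->data[b->len++] = c;
    b->data[b->len] = '\0';
    return 0;
}

/* Releases the buffer and resets the bookkeeping fields. */
void textbuf_free(TextBuf *b)
{
    free(b->data);
    b->data = NULL;
    b->len = 0;
    b->cap = 0;
}
```

A long comment would then only fail when memory runs out, rather than when it exceeds a compile-time constant.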

Restrict analysis to certain files

Is it possible to restrict analysis output to certain files only? There are a lot of library files that are part of the ARM CMSIS standard that cause Cobra to output matches, but that are not relevant for analysis since those files won't be changed. Is there a way to filter the output to only specific files?

False positive: "R13a: (related) - do not use single-letter global identifiers"

Using the jpl ruleset on a file with angle-bracket include directives (#include <stdlib.h>, for example), the rule erroneously reports use of a single-letter identifier.

To reproduce:

$ cobra -C++ -comments -f jpl t1.c

Where t1.c contains:

#include "rmw/allocators.h"
#include <stdlib.h>
#include <string.h>
#include <rcutils/allocator.h>
#include "rmw/types.h"

extern void some_function();

Results in:

=== R13a: (related) - do not use single-letter global identifiers: 3
t1.c:2:
  1:      2  #include <stdlib.h>
t1.c:3:
  2:      3  #include <string.h>
t1.c:4:
  3:      4  #include <rcutils/allocator.h>
=== R13: declare data objects at smallest possible level of scope
	globals used in one scope only:   0
	globals used in one file  only:   5
	stdlib	used in only file t1.c
	string	used in only file t1.c
	rcutils	used in only file t1.c
	allocator	used in only file t1.c
	some_function	used in only file t1.c
=== R16: Nr of statements: 1
=== R16: Nr of assertions: 0
=== R16: the minimum number of assertions is 2% = 0

Assertion failure on Basic rule set

Hello,

I am seeing the following assertion error when running the basic rule set.

cobra: cobra_lib.c:600: clear_range: Assertion `z->upto > z->from' failed.

Any ideas what might be the cause?

Assertion `strlen(c)+strlen(f->s) < n' failed.

When running Cobra with the following command

cobra -v -f p10 *.[ch]

I get

cobra: cobra_prim.c:236: check_args: Assertion `strlen(c)+strlen(f->s) < n' failed.
Aborted (core dumped)

How to avoid multiple results between same lines

I am searching for a pattern where a lock is acquired and released within a function or method. I need to check whether we return before releasing the lock.
When Cobra reports a match, I want only the largest block of code to be printed. Currently, it prints the largest block as well as the smaller code blocks nested within that function or method.

cobra -pat '{ . ^acquire ^release* }' tack.c

Here I want to print only tack.c:805..830 and skip the smaller blocks. Since I am searching a lot of files, I get many matches, and I want to avoid going through the same code repeatedly.

tack.c:805..830
   805  {
   806     IntfStackPropagateStaus *propagaeStatusMsg = (IntfStackPropagateStaus *) (msg+1);
   807     char statusBuf[BUFLEN_64] = {0};
   808     CmsRet ret;
   809  
   810     if ((ret = cmsLck_acquireLockWithTimeout(SSK_LOCK_TIMEOUT)) != CMSRET_SUCCESS)
   811     {
   812        cmsLog_error("failed to get lock, ret=%d", ret);
   813        cmsLck_dumpInfo();
   814        return;
   815     }
   816  
   817     if ((ret = qdmIntf_getStatusFromFullPathLocked_dev2(propagaeStatusMsg->ipLowerLayerFullPath,
   818                                                         statusBuf,
   819                                                         sizeof(statusBuf))) != CMSRET_SUCCESS)
   820     {
   821        cmsLog_error("getStatusFromFullPath failed for %s, ret=%d", propagaeStatusMsg->ipLowerLayerFullPath, ret);
   822        /* complain but don't exit */
   823     }
   824     else
   825     {
   826        intfStack_propagateStatusByFullPathLocked(propagaeStatusMsg->ipLowerLayerFullPath, statusBuf);
   827     }
   828  
   829     cmsLck_releaseLock();
   830  }
tack.c:807..830
   807     char statusBuf[BUFLEN_64] = {0};
   808     CmsRet ret;
   809  
   810     if ((ret = cmsLck_acquireLockWithTimeout(SSK_LOCK_TIMEOUT)) != CMSRET_SUCCESS)
   811     {
   812        cmsLog_error("failed to get lock, ret=%d", ret);
   813        cmsLck_dumpInfo();
   814        return;
   815     }
   816  
   817     if ((ret = qdmIntf_getStatusFromFullPathLocked_dev2(propagaeStatusMsg->ipLowerLayerFullPath,
   818                                                         statusBuf,
   819                                                         sizeof(statusBuf))) != CMSRET_SUCCESS)
   820     {
   821        cmsLog_error("getStatusFromFullPath failed for %s, ret=%d", propagaeStatusMsg->ipLowerLayerFullPath, ret);
   822        /* complain but don't exit */
   823     }
   824     else
   825     {
   826        intfStack_propagateStatusByFullPathLocked(propagaeStatusMsg->ipLowerLayerFullPath, statusBuf);
   827     }
   828  
   829     cmsLck_releaseLock();
   830  }
tack.c:811..815
   811     {
   812        cmsLog_error("failed to get lock, ret=%d", ret);
   813        cmsLck_dumpInfo();
   814        return;
   815     }
tack.c:820..823
   820     {
   821        cmsLog_error("getStatusFromFullPath failed for %s, ret=%d", propagaeStatusMsg->ipLowerLayerFullPath, ret);
   822        /* complain but don't exit */
   823     }
tack.c:825..827
   825     {
   826        intfStack_propagateStatusByFullPathLocked(propagaeStatusMsg->ipLowerLayerFullPath, statusBuf);
   827     }
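Until the tool can do this itself, the nested matches could be dropped in a post-processing step. The sketch below shows the idea on line ranges from a single file: a range is kept only if no other range encloses it. `Range`, `strictly_inside` and `keep_outermost` are hypothetical names for this illustration, not part of Cobra.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical line range of one reported match (same file assumed). */
typedef struct {
    int from;
    int upto;
} Range;

/* Returns 1 if range a lies entirely inside range b and is not equal to it. */
int strictly_inside(Range a, Range b)
{
    return b.from <= a.from && a.upto <= b.upto
        && (a.from != b.from || a.upto != b.upto);
}

/* Copies into out[] the ranges of in[] that no other range encloses.
 * out must have room for n entries. Returns the number of ranges kept.
 * Quadratic in n, which is fine for a handful of matches per file. */
size_t keep_outermost(const Range *in, size_t n, Range *out)
{
    size_t kept = 0;

    assert(in != NULL && out != NULL);
    for (size_t i = 0; i < n; i++) {
        int enclosed = 0;
        for (size_t j = 0; j < n && !enclosed; j++) {
            if (strictly_inside(in[i], in[j])) {
                enclosed = 1;
            }
        }
        if (!enclosed) {
            out[kept++] = in[i];
        }
    }
    return kept;
}
```

Applied to the five ranges reported above (805..830, 807..830, 811..815, 820..823, 825..827), only 805..830 is kept.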

"cobra -configure $COBRA/rules" sets wrong folder for rules

I have built an Alpine-based Docker image to compile Cobra from source and work with it.

I hit the following problem:

$ export COBRA=/workspace/cobra
$ cobra -configure $COBRA/rules

> cobra: configuration completed

$ cobra -cpp -f basic *.[ch]

> cobra: cannot find 'basic'

I checked ~/.cobra and found that the configure command appends an extra "/rules" to the path:

$ cat ~/.cobra 

> Rules: /workspace/cobra/rules/rules
> # ncore: 1

Contrary to what the README file says, using only cobra -configure $COBRA solves the problem.

Please fix this either in the configuration method or in the README file.

PS: I don't know your stance on using Docker. If you are in favor and just have not had time for it, I can supply my Dockerfile.

Support the SARIF output format for Cobra output

In order to integrate with tools that support a software development process envisioned for Space ROS, Cobra should support the SARIF output format for detected issues.

Jenkins integration is a higher priority (JUnit XML format output), but SARIF is also desirable. It would also be helpful to be able to output both file types for a single scan (so that the tool doesn't have to be run again just to produce a different output format).

Use of threads in Cygwin

Hello,

I noticed something interesting about using Cobra with Cygwin. When I specify the number of threads to be one (-N1), the processing is significantly faster, about 1.9x faster than without the flag in my testing.

From what I can tell this is a platform specific issue, but I wasn't sure if Cobra could run with -N1 by default in the case where Cygwin is detected.

P.S. I was also curious how many threads are spawned by default in Cobra.
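A platform-specific default of this kind could be selected at compile time with the predefined `__CYGWIN__` macro. The sketch below is only an illustration: `default_ncore` and both of its parameters are hypothetical, and it makes no claim about how Cobra actually chooses its thread count.

```c
#include <assert.h>

/* Hypothetical helper, not Cobra code: choose the number of worker threads.
 * requested        > 0 if the user gave an explicit count (as with -N)
 * platform_default the count the tool would otherwise use */
int default_ncore(int requested, int platform_default)
{
    assert(platform_default >= 1);
    if (requested > 0) {
        return requested;          /* an explicit choice always wins */
    }
#ifdef __CYGWIN__
    (void) platform_default;       /* unused on this platform */
    return 1;                      /* single thread was measured faster here */
#else
    return platform_default;
#endif
}
```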

extern.cobra script does not work even if inconsistent types are present

Test case tried:
I declared a variable of type char in one file and an extern of the same name with type int in a separate file. No error was reported.

in `cobra_prim.c` file, line 178, `/../bin` seems to be a left over, or!?

There is this line in the cobra_prim.c file, at line 178:

	{	n += strlen(c_base) + strlen("/../bin") - strlen("$COBRA");

This seems to be a leftover and might lead to problems in the long run.

It seems to have been written with two assumptions in mind:

binary files would be in the $COBRA/bin folder
rules defined by $C_BASE would be under $COBRA/rules

Currently, there are two issues with this:

Binaries are under $COBRA/bin_[mac|cygwin|linux]. This can be solved by telling users to rename the binary folder for their system to bin.
Users can point $C_BASE at a different rules folder before running Cobra, for any reason, and then the computed bin path no longer points into the Cobra directory at all.

If the purpose of this line is to point to Cobra's bin folder, it needs some changes. Unfortunately, I don't have a proposal for that. I hope you will find one.

PS: May I ask a question? Is nimble-code a team or a single programmer? I feel I am overwhelming you right now with these issues :)

No output files when using cwe and misra2012 rule sets

When using the -json option, an output file is sometimes created and other times not, depending on the ruleset:

basic: _Basic_.txt
cwe: No output file
p10: _P10.txt (inconsistent naming, see #47)
jpl: _JPL_.txt
misra2012: No output file

I would expect that all rulesets would output JSON consistently and with a consistent output filename convention.

SARIF generation and duplicate generation

Hello again! I have one more issue with SARIF generation from this tool. There seems to be a trailing comma in the rules field. My guess is that this line is the culprit, but I'm not familiar enough with C to fix it myself. Once this (hopefully) last issue is fixed, the SARIF will be parseable with other tooling 🥳
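A common way to avoid a trailing comma when emitting a JSON array is to write the separator before every element except the first. The sketch below shows only that pattern: `write_rule_array` and the `{ "id": ... }` shape are assumptions made for illustration, not Cobra's actual emitter or its exact SARIF layout.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical emitter, not Cobra code: writes a JSON array of rule objects.
 * The ids are assumed to contain no characters that need JSON escaping. */
void write_rule_array(FILE *fp, const char *const *ids, size_t n)
{
    assert(fp != NULL);
    fputs("[", fp);
    for (size_t i = 0; i < n; i++) {
        /* separator goes before each element except the first,
         * so nothing is left dangling after the last one */
        fprintf(fp, "%s{ \"id\": \"%s\" }", i > 0 ? ", " : "", ids[i]);
    }
    fputs("]", fp);
}
```

Because the comma is tied to the start of an element rather than its end, the output stays valid for zero, one, or many rules.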

`cobra_json.c`: 'sprintf' output between 3 and 523 bytes into a destination of size 512

I haven't used the JSON output format, but the following error comes up when I try to recompile.

cobra_json.c: In function 'check_bvar':
cobra_json.c:65:23: error: 'sprintf' may write a terminating nul past the end of the destination [-Werror=format-overflow=]
   65 |  { sprintf(buf, "%s %d", c->txt, c->lnr);
      |                       ^
cobra_json.c:65:4: note: 'sprintf' output between 3 and 523 bytes into a destination of size 512
   65 |  { sprintf(buf, "%s %d", c->txt, c->lnr);
      |    ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
cc1: all warnings being treated as errors

I tried increasing the buffer to 1024, but the buffer size does not seem to matter: the reported maximum output is always 11 bytes more than the buffer size:

note: 'sprintf' output between 3 and 1035 bytes into a destination of size 1024

I can't see the cause (it is beyond my level), so for now I continue compiling with CFLAGS= ... -Wno-error=format-overflow to get around it.
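A bounded alternative is to use snprintf and check its return value, so the output can never run past the buffer regardless of how long the text and the number are. A 32-bit int prints as at most 11 characters ("-2147483648"), which may be where the extra 11 bytes in the diagnostic come from. The sketch below is hypothetical: `format_token` is not a function in cobra_json.c, it only mirrors the "%s %d" format from the error message.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical replacement for the sprintf call, not Cobra code:
 * formats "<txt> <lnr>" into buf without writing more than size bytes.
 * Returns 0 if the whole string fit, -1 on truncation or output error. */
int format_token(char *buf, size_t size, const char *txt, int lnr)
{
    int n;

    assert(buf != NULL && size > 0 && txt != NULL);
    n = snprintf(buf, size, "%s %d", txt, lnr);
    if (n < 0 || (size_t) n >= size) {
        return -1;   /* caller decides how to handle the oversized token */
    }
    return 0;
}
```

On truncation snprintf still NUL-terminates what it wrote, so the buffer remains a valid string even when the function reports failure.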

False positive: "deref of <x> preceds null test"

Using the basic ruleset, when a pointer variable is initialized in a declaration list, the rule incorrectly identifies it as a dereference, which results in the "deref preceeds null test" warning.

To reproduce:

$ cobra -C++ -comments -f basic t.c

Where t.c is:

char *
rcutils_repl_str(
  const char * str,
  const char * from,
  const char * to,
  const rcutils_allocator_t * allocator)
{
  char *pret, *ret = NULL;

  if (ret == NULL) {
    ;
  }

  return ret;
}

Results:

t.c:8 	deref of ret 	preceeds null test at line 10
Candidate reverse nulls: 2
t.c:8:
  1:      8    char *pret, *ret = NULL;
t.c:10:
  2:     10    if (ret == NULL) {
1 errors

	char *
	pattern(char *p)
	{ char *n = (char *) emalloc(2*strlen(p)+1);
