The following procedure removes C-style comments (i.e. /* ... */) from text.
    proc removeComments { text {replacement ""} } {
        regsub -all {[/][*].*?[*][/]} $text ${replacement} text
        return $text
    }
If you need to remove nested (embedded) C-style comments (i.e. /* ... /* ... */ ... */), use the following procedure.
    proc removeImbeddedComments { text {replacement ""} } {
        set text [string map {"/*" \x80 "*/" \x81} $text]
        while {[regsub -all {\x80[^\x80\x81]*?\x81} $text ${replacement} text]} {continue}
        set text [string map {\x80 "/*" \x81 "*/"} $text]
        return $text
    }
Use Examples:
    removeComments ${data} "#comment-removed#"
    removeImbeddedComments ${data} "#comment-removed#"
Test Cases:
    ##### Simple Comments #####
    # test-1
    /**/ /* */ /* text1 */
    # test-2
    text1 /**/ text2 /* */ text3 /* comment */
    # test-3
    /* */ text1 /* */ text2 /* */
    # test-4
    text1 /* */ text2 text1 /* */ text2
    # test-5
    /* comment */ /* comment */ /* comment */
    ##### Imbedded Comments #####
    # test-1
    text1 /*/*/**/*/*/ text2
    # test-2
    text1 /*/**//**//*/**//**//**/*/*/ text2
    # test-3
    text1 /* comment /* comment /* comment */ comment */ comment */ text2
    # test-4
    text1 /* text2 text3 /* comment */ text4 /* comment comment /* comment */ comment */ text5 */ text5
    # test-5
    text1 /* comment /// /* comment /// /* comment /// comment *** */ comment *** */ comment *** */ text2
    # test-6
    text1 * / / * /* comment /// /* comment /// /* comment /// comment *** */ comment *** */ comment *** comment /// /* comment /// /* comment /// comment *** */ comment *** comment /// /* comment /// comment *** */ comment *** */ comment *** */ text2
    # test-7 (dangling comments)
    */ /*
Test results from the removeImbeddedComments procedure were as follows.
    ##### Simple Comments #####
    # test-1
    #comment-removed# #comment-removed# #comment-removed#
    # test-2
    text1 #comment-removed# text2 #comment-removed# text3 #comment-removed#
    # test-3
    #comment-removed# text1 #comment-removed# text2 #comment-removed#
    # test-4
    text1 #comment-removed# text2 text1 #comment-removed# text2
    # test-5
    #comment-removed# #comment-removed# #comment-removed#
    ##### Imbedded Comments #####
    # test-1
    text1 #comment-removed# text2
    # test-2
    text1 #comment-removed# text2
    # test-3
    text1 #comment-removed# text2
    # test-4
    text1 #comment-removed# text5
    # test-5
    text1 #comment-removed# text2
    # test-6
    text1 * / / * #comment-removed# text2
    # test-7 (dangling comments)
    */ /*
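A few of the results above can be spot-checked with a short script. This is only a minimal sketch: the procedure is copied from this page so the snippet runs on its own, and checkCase is a hypothetical helper name that is not part of the original code.

```tcl
# Copy of removeImbeddedComments from this page, so the snippet runs on its own.
proc removeImbeddedComments { text {replacement ""} } {
    set text [string map {"/*" \x80 "*/" \x81} $text]
    while {[regsub -all {\x80[^\x80\x81]*?\x81} $text ${replacement} text]} {continue}
    set text [string map {\x80 "/*" \x81 "*/"} $text]
    return $text
}

# checkCase is a hypothetical helper: it returns 1 when the result matches.
proc checkCase { input expected } {
    set actual [removeImbeddedComments $input "#comment-removed#"]
    return [expr {$actual eq $expected}]
}

# Inputs and expected results taken from the test cases listed above.
puts [checkCase "text1 /*/*/**/*/*/ text2" "text1 #comment-removed# text2"]
puts [checkCase "text1 /**/ text2 /* */ text3 /* comment */" \
    "text1 #comment-removed# text2 #comment-removed# text3 #comment-removed#"]
puts [checkCase "*/ /*" "*/ /*"]
```

Because the procedure uses \x80 and \x81 as temporary stand-ins for the comment delimiters, input that already contains those two characters would presumably confuse it; the sketch does not guard against that.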
Pierre Coueffin (03 Sept. 2005): You do have to be careful if you try to use this on actual comments in C code.
if 0 {
    removeComments {printf ("/* %s */\n", "Comment to print"); /* Prints a comment to stdout */}
returns:
    printf ("\n", "Comment to print");
where you might expect to see:
    printf ("/* %s */\n", "Comment to print");
}
tbtietc - 2009-06-25 11:19:05
    regsub -all {('([^\']|\\.)')|("([^\"]|\\.)*")|(//[^\n]*)|(/\*([^*]|[*][^/])*\*/)} $text "\\1\\3" text;
This detects: A. a character literal in single quotes; B. a string literal in double quotes; C. C-style comments (both // and /* ... */).
It replaces A and B with themselves (quotes intact) and C with the empty string (comments deleted).
For example, given text as the following C code:
    // I hope this is going to /* be detected as a */ comment.
    /* Similarly, this too should be detected // as a comment. */
    /* This is a /* comment.*/
    /* A quote uses " and a single quote uses ' */
    /* This one is a single-line comment */
    /* This
     * is a
     * multiple-line
     * comment */

    /* This one has 2 comments */int ttt; //arranged like this.
    int main () {

        int /* Comment */ a = 10; ///*comment*/10;
        char *s1 = "http://www.google.com";
        char *s2 = "/* This is a comment. */";
        char *s3 = "A quote is this \"";
        char *s4 = "A single-quote is this '";
        char ch = '"';
        char ch2 = '"';

    }
Output:
    <8 blank lines>
    int ttt;
    int main () {

        int a = 10;
        char *s1 = "http://www.google.com";
        char *s2 = "/* This is a comment. */";
        char *s3 = "A quote is this \"";
        char *s4 = "A single-quote is this '";
        char ch = '"';
        char ch2 = '"';

    }
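For repeated use, the regsub can be wrapped in a procedure. The following is a minimal sketch, not the poster's own code: stripCComments is a hypothetical name, and the pattern is the post's regular expression with its bracket expressions written out in full.

```tcl
# stripCComments is a hypothetical wrapper name, not part of the original post.
proc stripCComments { text } {
    # Capture groups: 1 = character literal, 3 = string literal,
    #                 5 = // comment,        6 = /* ... */ comment.
    set re {('([^\']|\\.)')|("([^\"]|\\.)*")|(//[^\n]*)|(/\*([^*]|[*][^/])*\*/)}
    # Literals are written back through \1 and \3; the comment groups are
    # left out of the replacement, so comments are deleted.
    regsub -all $re $text {\1\3} text
    return $text
}

# The comment markers inside the string literal survive; real comments do not.
puts [stripCComments {char *s = "/* kept */"; /* dropped */ int a; // gone}]
```

One limitation that follows from the pattern itself: a comment whose closing delimiter is preceded by an extra asterisk, such as /* text **/, is not matched by the last alternative, so such comments would be left in place.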