Archive for May, 2008|Monthly archive page
How to code
Well, I think this post might start off my tech-blogging spree. That’s not to say much about the original spree. Three posts in two years’ time doesn’t quite fit the definition of one.
This one goes out to all of you out there who have just done 12th standard or typical first year B Tech C or C++ and, in my opinion, have a lot of things to unlearn. Some of you may find the content in this post shocking, some even offensive, but hey, I present facts as they are, and I didn’t write the standards documents.
Yes, there are standards documents for C and C++, very detailed guidelines regarding how to write clean, portable C and C++ code. Download them for free from http://www.open-std.org/.
Before starting, I’d also like to make a few disclaimers:
- I’ve tried to keep the content of this post as technically accurate and unambiguous as I could. Wherever appropriate, I’ve cited references, and most of the things I’ve cared to mention here are probably explained in more detail at the references.
- Since I’m a C++ programmer myself, some of the content in this post might be C++ specific. Though the languages have many things in common, there are many subtle differences. I’ve made efforts to highlight these differences wherever necessary. On the other hand, having used C++ most of my life, when it comes to C, there might actually be differences that I don’t know about. It’s up to the reader to verify the authenticity of such content in this post. (http://www.c-faq.com/ might come in handy.)
- If you find some errors in this post, please post them as comments. I’ll do my best to verify and post the corrections here as early as possible.
The order in which topics follow is quite random. Most of them are unrelated, and you can probably read the ones that intrigue you the most.
Important terms:
I’m copy-pasting a few definitions with examples from the standards documents here.
Undefined Behavior:
behavior, upon use of a nonportable or erroneous program construct or of erroneous data, for which this International Standard imposes no requirements
For example, the behavior of your code on signed integer overflow, or the behavior of constructs like a = a++; is undefined.
Undefined behavior doesn’t mean that your code wouldn’t compile. It simply means that the standards specify no requirements on the behavior of your code. For all you know, your program might start baking cookies.
This is probably a good time to point out that just because your code compiles without errors, it doesn’t mean the code is absolutely right. Going a step further, protests like, “a = a++; seems to be working fine for me,” are meaningless, and are equivalent to saying, “I was playing soccer the other day, and threw the ball into the goal post with my hands, and it worked fine!”. That’s just not the way the game is played.
Try this page, for more information: http://www.eskimo.com/~scs/readings/undef.950311.html
Unspecified Behavior:
use of an unspecified value, or other behavior where this International Standard provides two or more possibilities and imposes no further requirements on which is chosen in any instance.
For example, the order in which arguments are evaluated in a function call.
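To make that concrete, here’s a small sketch in C++ (next_id and ids_in_order are names I made up for illustration). Something like use(next_id(c), next_id(c)) would leave it to the compiler which call runs first; giving each call its own statement pins the order down.

```cpp
// Hypothetical helper with a side effect: bumps the counter and returns it.
inline int next_id(int &counter) {
    return ++counter;
}

// Each call gets its own statement, so the order is fixed by the
// sequence points between statements, not left to the compiler.
inline bool ids_in_order() {
    int counter = 0;
    int first = next_id(counter);    // always runs first
    int second = next_id(counter);   // always runs second
    return first == 1 && second == 2;
}
```

If the order matters to you, spell it out like this rather than hoping your compiler picks the one you had in mind.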
Implementation Defined Behavior:
unspecified behavior where each implementation documents how the choice is made
An example would be the propagation of the high-order (most significant) bit when a signed integer is shifted right.
Locale-specific Behavior:
behavior that depends on local conventions of nationality, culture, and language that each implementation documents
For example, whether islower() returns true for characters other than a-z is locale-specific.
Further, as far as C or C++ is concerned, the terms byte and char are, for the most part, interchangeable. Consider the following definitions as per the standards:
- sizeof returns the size of an object or a type, in bytes.
- A char is a single-byte character, which means that sizeof(char) is 1 by definition.
- A byte is an addressable unit of data storage large enough to hold any member of the basic character set of the execution environment.
Putting these pieces together tells you that a byte needn’t be 8 bits anymore. In fact, if I made a standards-compliant C compiler that works with Unicode, rather than ASCII, my chars would need 16 bits. But sizeof(char) is 1, and sizeof returns size in bytes. That means, for my implementation, a byte is 16 bits of data, not 8 bits. The number of bits in a byte is specified by the macro CHAR_BIT found in limits.h (climits, in C++).
Good code should try not to rely on the number of bits in a byte being 8. In most cases, you should be able to do without such assumptions. Use the CHAR_BIT macro if absolutely necessary.
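As a rough sketch of what using CHAR_BIT looks like in C++ (bits_in is a hypothetical helper of mine, not something the standard library provides):

```cpp
#include <climits>   // CHAR_BIT
#include <cstddef>   // std::size_t

// Hypothetical helper: number of bits occupied by an object of type T.
// sizeof counts bytes, and CHAR_BIT says how many bits each byte holds,
// so nothing here assumes that a byte is 8 bits wide.
template <typename T>
std::size_t bits_in() {
    return sizeof(T) * CHAR_BIT;
}
```

On a typical desktop bits_in<int>() would come out as 32, but that’s a property of the implementation, not something the language promises. The one thing you can count on is that CHAR_BIT is at least 8.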
Let’s begin, then.
Why main shouldn’t be void:
This is definitely going to be new to somebody who hasn’t seen books other than Kanetkar’s or Sumita Arora’s. void main() is wrong!
The explanation given in these books for such usage is something like, “we do not wish to return any values from main, so we mark it as void.” But main() isn’t any other function, now, is it?
As far as the standards go, main must return int, and only int: (this is with reference to a hosted environment: your C or C++ program runs with the help of an operating system)
5.1.2.2.1, C
The function called at program startup is named main. The implementation declares no prototype for this function. It shall be defined with a return type of int and with no parameters:
int main(void) { /* ... */ }
or with two parameters (referred to here as argc and argv, though any names may be used, as they are local to the function in which they are declared):
int main(int argc, char *argv[]) { /* ... */ }
or equivalent; or in some other implementation-defined manner.
The standards clearly specify two valid ways to define main, and they both return int. (And or equivalent simply means that int may be substituted by some other name typedefed to int, or that char *argv[] may be replaced by char **argv and so on.) In fact, in C, a program that defines main in any other way invokes Undefined Behavior unless the implementation documents that form as one it supports. C++ is stricter still: there, main must return int.
The question is, where does this return value go? Well, the value returned by main goes to the calling system, something that invoked your application. In many cases, this might be the operating system. It may also be some applications written by other programmers like you. The return value of main is a handy way to test whether your program executed correctly, a lot handier than having to look at the error stream of your application, and parsing it to figure out if something went wrong. Generally, a return value of 0 indicates success, and a non-zero value would stand for different error codes.
As an example, let’s say I’m making an installer, and you’ve already made an application to unpack archives. I can simply run your application with necessary arguments, and check the return value to see if your application extracted my archive correctly.
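Here’s a minimal sketch of the unpacker’s side of that bargain. Everything in it (the name run_unpacker, the usage message, the omitted extraction step) is invented for illustration; the point is only that every path out of the program reports success or failure through the value main returns.

```cpp
#include <cstdio>    // std::fprintf, std::fopen, std::fclose
#include <cstdlib>   // EXIT_SUCCESS, EXIT_FAILURE

// Hypothetical body of an "unpacker" program. A real main would simply be:
//     int main(int argc, char *argv[]) { return run_unpacker(argc, argv); }
int run_unpacker(int argc, char *argv[]) {
    if (argc < 2) {
        std::fprintf(stderr, "usage: %s <archive>\n",
                     argc > 0 ? argv[0] : "unpack");
        return EXIT_FAILURE;   // tell the caller that nothing was unpacked
    }
    std::FILE *archive = std::fopen(argv[1], "rb");
    if (archive == NULL) {
        return EXIT_FAILURE;   // the archive could not be opened
    }
    // ... the actual extraction is left out of this sketch ...
    std::fclose(archive);
    return EXIT_SUCCESS;       // the caller sees "everything went fine"
}
```

EXIT_SUCCESS and EXIT_FAILURE come from the standard headers, so the installer doesn’t have to guess which numbers mean what on a given platform.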
Sizes of structs:
Consider a struct defined as below:
struct MyStruct {
int a;
char ch;
};
If you’ve learnt that sizeof(MyStruct) would yield 3, you’ve learnt it wrong. For one thing, the size of an integer is specified to simply be the natural size suggested by the architecture of the execution environment. This means that an integer can be 2 bytes or 4 bytes, or however many bytes the implementation sees fit.
Now, is sizeof(MyStruct) == sizeof(int) + 1?
The answer is still no, thanks to something called structure padding. Compilers are free to pad structures with extra bytes so that each member sits at an address the hardware can access efficiently. Generally, structure padding is done in such a way as to align the objects with words of the system. On a 32 bit system, for example, this kind of padding would typically leave structure sizes a multiple of four bytes. Hence, the above struct can very well weigh in at 8 bytes.
See the wikipedia page that deals with this: http://en.wikipedia.org/wiki/Data_structure_alignment
http://www.goingware.com/tips/getting-started/alignment.html might prove to be a good read too.
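If you’d rather ask your own compiler than take my word for it, something along these lines should do. The helper names padding_bytes and offset_of_ch are mine, and the numbers they return are whatever your implementation decides.

```cpp
#include <cstddef>   // offsetof, std::size_t

struct MyStruct {
    int a;
    char ch;
};

// Bytes in the struct that belong to no member, i.e. the padding.
// Hypothetical helper; the result differs from one implementation to the next.
inline std::size_t padding_bytes() {
    return sizeof(MyStruct) - (sizeof(int) + sizeof(char));
}

// Where ch actually lives inside the struct, as chosen by the compiler.
inline std::size_t offset_of_ch() {
    return offsetof(MyStruct, ch);
}
```

On many 32 bit and 64 bit systems padding_bytes() comes out as 3, but treat that as an observation about those systems and never as a rule.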
No more conio.h:
There is no such thing as a conio.h as far as standard C or C++ goes. Or a graphics.h. DOS mode graphics are obsolete, and ought to be done away with.
No more clrscr(), either. Though system("cls"); (on Windows) and system("clear"); (on Unix-like systems) may prove to be alternatives.
But do you really want to clear the screen? On a terminal like Windows’, clearing the screen practically wipes out everything that the user had on his terminal. There is no way to retrieve the information (as far as I know). What if he’d spent the last few decades calculating the first billion gazillion digits of PI? (in which case, he ought to have redirected the output to a file, but, hey, what the heck? This is just an example. Besides, I doubt if a Windows machine could have such uptimes.) Would you wash it all away without even warning the chap? I wouldn’t.
Short circuit evaluation:
The logical AND (&&) and OR (||) operators operate by what’s known as short circuit evaluation. (The last time I checked, Balagurusamy didn’t know about this.)
What this means is that they guarantee left to right evaluation of operands, plus:
- The && operator evaluates the expression on the right hand side only if the left hand side evaluated to true.
- The || operator evaluates the expression on the right hand side only if the left hand side evaluated to false.
Think of it as a kind of optimization. If the left hand side of an && is false, the result is going to be false, so there’s no point in evaluating the right hand side (who knows how long a function call might take, eh?). Same kinda thing goes for ||.
Besides possibly saving some time, there are a few notable consequences to this.
For example, thanks to short circuit evaluation, expressions like:
b != 0 && a / b < 100
p != NULL && p->value == 50
and so on become inherently safe.
You can also use handy expressions like:
strcmp(str1, str2) || cout << "Strings are equal!" << endl;
(I’d consider this a little difficult to read, especially for people who don’t know what short circuit evaluation is, and avoid this as far as possible.)
As another example, try:
int i = -1, j = 5; int k = ++i && ++j;
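Does j get incremented? No. ++i evaluates to 0, which is false, so the right hand side of the && is never evaluated: j stays 5, and k ends up 0.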
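Wrapped up as functions you can poke at, the two ideas look roughly like this (short_circuit_demo and small_quotient are names I’ve made up for the sketch):

```cpp
// Returns the final value of j, so the effect (or lack of one) is visible.
inline int short_circuit_demo() {
    int i = -1, j = 5;
    int k = ++i && ++j;   // ++i yields 0 (false), so ++j is never evaluated
    (void)k;              // k is 0 here; the cast just silences "unused" warnings
    return j;
}

// A guarded division: b != 0 is checked first, so a / b runs only when safe.
inline bool small_quotient(int a, int b) {
    return b != 0 && a / b < 100;
}
```

Calling small_quotient(10, 0) is perfectly safe: the division is simply never attempted.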
(C++ only) structs can have member functions:
As far as C++ goes, structs and classes are the exact same thing. structs have public visibility (and public inheritance) by default, while classes have private. Other than this, possibly small, difference, you can do anything with structs that you can do with classes. Add member functions, inherit them, anything!
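A quick sketch, with Point and NamedPoint being throwaway types invented for the occasion:

```cpp
// A struct with a constructor and a member function.
struct Point {
    int x, y;
    Point(int x_, int y_) : x(x_), y(y_) {}
    int manhattan() const {
        return (x < 0 ? -x : x) + (y < 0 ? -y : y);
    }
};

// Inheriting from a struct. Because NamedPoint is itself a struct,
// the base class is public by default.
struct NamedPoint : Point {
    const char *name;
    NamedPoint(const char *n, int x_, int y_) : Point(x_, y_), name(n) {}
};
```

Swap struct for class in both definitions and you’d have to add public: and write ": public Point" to get the same behavior; that is the whole difference.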
Why a = a++ is undefined:
This was an example I used earlier in the context of Undefined Behavior. Perhaps I can elaborate a bit more on why this construct invokes Undefined Behavior in this section.
The standards clearly say:
Between the previous and next sequence point an object shall have its stored value modified at most once by the evaluation of an expression. Furthermore, the prior value shall be read only to determine the value to be stored.
This pretty much gets all different combinations like
a = a++; a = a++ * ++a; A[a] = a++;
and so on out of the way for good.
To understand the statement, however, we need to know about side effects and sequence points.
Side effects are basically changes of the state of the execution environment, like, for example, modifying an object, modifying a file, etc., or calling functions or using operators that involve these kinds of operations. For example, in a simple a++, the increment is a side effect.
Sequence points are points in the execution sequence where all side effects of previous evaluations have taken place, and no side effects of subsequent evaluations will have taken place. The end of a full expression, for example, denotes a sequence point.
In a = a++; consider the two consecutive sequence points, one immediately before and one immediately after the statement. You’re modifying a with an increment, as well as trying to assign a new value to a using the = operator. Two modifications between two consecutive sequence points, undefined behavior.
To be a bit plainer, just know that the exact point of time when the ++ increments a is not specified. It is guaranteed that the increment will occur before the next statement, and that’s about it. This leaves several possible ways in which the expression can be evaluated, all of which are completely valid, as far as the standards are concerned, two of which might be:
- Perform the assignment, and then the increment. In this case, the value of a gets incremented by 1.
- Store the original value of a in a temporary variable, perform the increment, and then assign the value in the temporary variable back into a. In this case, the value of a remains unchanged after the expression.
This kind of a discussion itself is pointless, but I’m including it here for those of you who need stout examples.
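The cure is always the same: decide what you actually meant, and give each modification its own statement. A sketch, assuming the intent was the obvious one (increment and store_then_advance are hypothetical names):

```cpp
// What a = a++ was most likely meant to say: just increment a.
inline int increment(int a) {
    ++a;          // one modification, then the statement ends
    return a;
}

// Instead of A[a] = a++; pick the index you mean and split the steps.
// Here the old value of a is both the index and the value stored.
inline int store_then_advance(int A[], int a) {
    A[a] = a;     // reads a, modifies only A[a]
    ++a;          // modifies a in a statement of its own
    return a;
}
```

Each statement now modifies any given object at most once, so there’s nothing left for the compiler to interpret creatively.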
Why fflush shouldn’t be used on stdin:
int fflush(FILE *stream);
If stream points to an output stream or an update stream in which the most recent operation was not input, the fflush function causes any unwritten data for that stream to be delivered to the host environment to be written to the file; otherwise, the behavior is undefined.
This simply rules out any possibility of using fflush on input streams. And for good reason. What does flushing an input stream mean, anyway? Does it make sense? Flushing refers to writing out all the left over contents in a buffer. Why would you ‘write out’ contents in an input stream?
To clear the input stream, use a function similar to the one shown below:
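#include <stdio.h>
void clear_stdin(void) {
    /* int, not char: getchar() returns an int so that EOF can be told apart from real characters */
    int ch;
    while ((ch = getchar()) != '\n' && ch != EOF)
        ;   /* discard everything up to the end of the line */
}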
More intelligent use of scanf might save you the hassle of having to clear the input stream at all. See http://www.cplusplus.com/reference/clibrary/cstdio/scanf.html and pay close attention to the section named ‘Whitespace character’.
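The trick in question is that a whitespace character in a format string swallows any amount of whitespace in the input, newlines included. The sketch below uses sscanf on a string instead of scanf on stdin, purely so that it’s self-contained; the format string works the same way in both. The function name is made up.

```cpp
#include <cstdio>   // std::sscanf

// Reads an int and then the next non-whitespace character from text.
// Returns the number of items converted (2 on success).
// The space before %c skips the newline left behind after the number,
// which is the usual reason people reach for fflush(stdin).
inline int parse_number_then_letter(const char *text, int *number, char *letter) {
    return std::sscanf(text, "%d %c", number, letter);
}
```

Without that space, %c would happily hand you the newline that followed the number.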
I can access private variables with pointers!:
Such a statement simply betrays all the misconceptions you have about ‘data hiding’.
Firstly, an object is a physical entity that resides in your computer’s memory, so yes, a little bit of low level code can tell you what’s stored in it. So you’re working no magic here.
Secondly, when we say ‘data hiding’, we simply mean that keeping data members in private sections of your classes will save them from being altered accidentally. Keeping data private gives your class a certain level of confidence about the different states it might find the data in at any point of time, because after all, it’s just the class that can meddle with the data, right? One needs to understand that such ‘data hiding’ is simply a kind of contract that you make, which details what are the right ways to work with a class, and what aren’t. It’s up to the users of the class to follow these guidelines while working with objects of the class.
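As an illustration of the kind of contract I mean, take a made-up Volume class that promises its value always stays between 0 and 100:

```cpp
// Hypothetical class whose invariant is 0 <= percent_ <= 100.
class Volume {
public:
    Volume() : percent_(50) {}

    // The only way to change the value; out-of-range requests are rejected.
    bool set(int percent) {
        if (percent < 0 || percent > 100) {
            return false;
        }
        percent_ = percent;
        return true;
    }

    int get() const { return percent_; }

private:
    int percent_;   // private, so code outside the class can't break the invariant by accident
};
```

Sure, you could cast a Volume’s address around and scribble 250 into it. All that proves is that you broke the contract, and nothing after that point is the class’s fault.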
What are string literals?
A double quoted string lying around in the code is a string literal.
For example, in cout << "ruggedrat" << endl; "ruggedrat" is a string literal.
What are the consequences?
Try
char *str = "ruggedrat";
Now, str is a pointer to a literal. This means that any attempt to modify ruggedrat to ruggedbat, like, say, str[6] = 'b'; invokes undefined behavior.
str really points to constant characters, and you’d do well to declare it that way too: const char *str. To create a mutable string, use one of the following:
char str[10] = "ruggedrat";
char str[] = "ruggedrat";
char str[10]; strcpy(str, "ruggedrat");
or some equivalent.
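Put together, the safe version looks something like this (edit_a_copy is just a name for the sketch):

```cpp
#include <cstring>   // std::strcmp

// Edits a writable copy of the text and checks that the copy changed
// while the literal itself was left alone.
inline bool edit_a_copy() {
    const char *literal = "ruggedrat";  // read-only; const lets the compiler enforce it
    char copy[] = "ruggedrat";          // a writable array initialised from the literal
    copy[6] = 'b';                      // fine: modifies the array, not the literal
    return std::strcmp(copy, "ruggedbat") == 0
        && std::strcmp(literal, "ruggedrat") == 0;
}
```

With the const in place, an accidental literal[6] = 'b'; becomes a compile-time error instead of undefined behavior at run time.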
I think that should do for now. There are a lot of other issues I could have taken up in this post, but my aim is neither to create a complete C++ reference here, nor to break the longest blog post record. This post is just aimed at directing people to good C and C++ coding practices, and showing where to find more information on them.
I’d conclude by saying that the compiler is just an executable. It’s not a foe you’re supposed to overcome. In fact, you’d do well at programming once you realize it’s quite the other way around. Efforts at cheating a compiler defeat the purpose of having one, and better programmers would see them merely as a vulgar display of your ineptitude at using the language.
Throw away outdated books, and delete antique compilers. Get the standards documents, and start writing clean applications.
- A Rugged Rat exhausted from all the typing.
P.S. A few neat FAQs to learn from:
- When in doubt, go back to the source, Luke. Bjarne Stroustrup’s FAQs: http://www.research.att.com/~bs/bs_faq.html, http://www.research.att.com/~bs/bs_faq2.html
- C++ FAQ LITE: http://www.parashift.com/c++-faq-lite/
- Google: http://www.google.com
Encored!
Over the weekend, I was informed that I was to become part of Delta (™, I think, or maybe it was ®) Core, the Central Webteam[1] of the college, the elite[2] group that takes care of NIT-Trichy’s official website, and all their LAN services, among other things.
“Hey, RR, we decided next year’s core. It’ll be you and Bharath,” said the head honcho[3]. “Bhattu will be so surprised!” he added, somewhat shamelessly.
Sure he will. And I’m just a robot. I won’t be surprised. We robots are never surprised. At best, we might throw an exception.
Not extremely sure what to say to what might be construed as a cold reception, I just nodded, and let out a squeaky, stifled, ‘yay’, which the head honcho pretended not to hear. Perhaps, I hoped, the casual welcome was designed simply to not throw me off my working spree. Having spent a really long year working on a Content Management System[4], during the course of which, PHP and MySQL had threatened to become second nature, it was clear I’d go on a sabbatical at the faintest knock of much anticipated opportunity. A parade and a party to celebrate the occasion, I hoped he feared, might just do the trick.
Having gotten into the Central Webteam in such spectacular fashion, the next question was, what do I do once I’m in there? What’s in store for the next couple of years?
Those astrology websites proved to be of little use to this end. They were more worried about my relationships and finances than I was. Despite the numerous good things they said about me and promised would happen to me, neither have I had a relationship until now, nor have I touched a penny that wasn’t eventually spent on food.
A little digging showed that Delta Core happens to be the secret-keepers of the Octa LAN, for those of you who are Harry Potter fans. They hold the passwords to pretty much all of the servers lying around, and most importantly, the lovely Mac kept in one corner of Sun Lab that gracefully glints in the sunlight coming through the smoked windows, and even plays music.
One thing’s for certain: more coding, lots of decision-making, and several night-outs[5] await, and we robots don’t complain. At best, we can throw an exception.
- Rugged Robot
[1] Central Webteam: The Central Webteam of NIT-T. More information here: http://www.nitt.edu/home/students/clubsnassocs/computing/delta/webteam/
[2] Elite: I like to call it that. Delta Core means a lot of work, and we don’t get paid. I gotta have something to hang on to.
[3] Head honcho: Find a more detailed description here: http://parijat-cybertechie.blogspot.com/2008/02/adventure-that-is-life.html (paragraph 6)
[4] Content Management System: The Pragyan (™, I think, or maybe it was ®) CMS. Woo hoo! It’s on SourceForge: http://sourceforge.net/projects/pragyan/
[5] Night-outs: A sleepless night; generally spent outside one’s hostel, in most cases, at the Octagon[6].
[6] Octagon: NIT-Trichy’s Computer Center[7].
[7] Ok, enough, already!
What’s in a Name?
Since in my first post, I’d talked about myself and my interests (maybe without the trailing s), and in a way, given sufficient excuse for my existence, I think it’s time I looked at the blog, from a wider perspective, why I made one, what I’m planning to do with it, and most importantly, why it’s called what it’s called.
Names are, in a way, meant to reflect some aspect about the named, if not describe it in its entirety. This, however, does not mean that a guy named Goodman is a good man, and if, by any chance, he is, we must be inclined to accept that that might be purely coincidental. This observation perhaps directly follows from the fact that not much can be conceived about a person’s attitudes, behavior, manners, likes, dislikes, or anything else, for that matter, when the person is an infant, or hasn’t been born yet, which is when most names are bestowed on their owners.
On the other hand, when it comes to naming things we create, we ought to pick something descriptive, lest we should look back at a later date, and wonder along the lines of, “what automobiles use Carnot’s engine?” Wouldn’t it have been awkward if bicycles were called bags and vice versa? In fact, naming is so important that programmers like myself are advised to stick to guidelines. A variable named bWhetherIShouldHeatTheWaterAndThenPutInTheTeaPowderOrNot makes a whole lot more sense than an a or an i or an n.
So why “Rugged Rat”? I must say, I had a bit of thinking to do before I could write this one. I think a look at the kind of conversations I find myself in every now and then should throw some light on this one.
A typical conversation on the couch in front of the TV:
“Hey, bro, I think I need to do something about my physical condition.”
“What condition?”
“My point exactly…”
Let’s say a potential girlfriend gets a birthday wish from me at 12.00 am sharp. The next day’s conversation would go something like:
“Wow, RR, that was so romantic. I love you!”
“Well, you should… Do you know how much trouble I went through to make that client and schedule it to deliver your greeting?”
“You mean it wasn’t you? Why you… I hate you! I don’t wanna see you anymore!”
That, I guess, should explain “Rugged Rat”: I’m a rat for having been at one end of each of those kinds of conversations, and I’m rugged ’coz I have been at one end of those kinds of conversations. However, in all sincerity, the name has nothing to do with any of it: it was, and still is, just a variation of the name “rugrats”.
Why I’m a Computer Geek!
What are you passionate about?
Movies? Are you the kind that, given the name of a movie, or maybe even its initials, starts reciting the entire cast and crew, the result of whose sweat and blood it is that you see on the screen? (I am, though I generally start with the actresses)
Soccer? Has never a club existed that managed to elude your scrutiny?
Music? Well, that’s been around ever since the caveman went about beating monkeys with a stick.
Quite frankly, if you’ve ever looked at somebody, and wondered why he or she is crazy about something, well, don’t. Everybody has their passions, and if you were thinking, “no, I don’t,” well, you’re the freak here.
And if you’ve ever looked at yourself and asked why you were crazy about something, I hope you’ve found some good enough answers.
I’m a programmer, and I make my living, oh wait, I don’t, I’m still in college. The one thing that drew me to computers so badly was that, well, the small box you keep on your desk, or on your lap, or even in your pockets packs more punch than world history, or geography, or perhaps both.
How does a package of circuitry, a maze of conductors dotted with diodes and transistors, a weird looking PCB that’s smart enough (or dumb enough) to recognize (just) 0s and 1s let you listen to music, watch a video clip and check the latest EPL results all at the same time? How do you get to read your mail, or print your resume? What makes your favourite games come to life? (If you answered, a GeForce, I’m with you, buddy!)
The answer lies in software (I’m sure your GeForce came with a CD). It’s those tit-bits of programming marvel that add that sparkle to your otherwise worthless heap of hardware, that act as an interface between man and machine (power extreme!*). Trying not to sound too technical, programming is that art that lets those gifted human beings communicate with a computer. Sounds wacky, but hey, you wouldn’t be here reading this without them. Any kind of (useful) communication with your hardware needs to be done through software. Everything you see on your monitor is due to software. And well, like any energetic youth on the lookout for opportunities, I grabbed it by the throat, and now I’m called a programmer. I daresay the venture has taught me a thing or two about how puters function, among other things.
However, as is the case with many professions, being a programmer is one thing, being a good programmer is another. The learning process never ends, and the striving to become better never ceases. Every day somebody or the other teaches you something new. It’s this very fact that keeps things interesting.
Anyway, I think this’s enough on me. I’ll be keeping this blog alive with more (and much better, I promise) posts.
- a Rugged Rat starting off his blogging streak.
* The phrase “Becoming man and machine, power extreme” goes way back. Ever seen The Centurions? Consult the omniscient wikipedia: http://en.wikipedia.org/wiki/The_Centurions_(TV_series)