Programming Techniques

Started by
17 comments, last by nobodynews 11 years, 9 months ago

Ten lines seems very small. Some of my functions have gotten smaller as I've worked on them, but the average is more than ten lines. I asked for advice about the code here a few months ago and no one said anything about the functions being too long.

Aesthetic criticism is generally discouraged, because it is so subjective. Thus, I'm not telling Spirrwell "your functions are too long", just that "I've generally found that shorter functions make for more readable code, and readability is really nice."
As an example (I hope you don't mind, I'll remove if you do):
[source lang="cpp"]unsigned char GiveOne ()
{
if (index >= packetLength) {
throw failure("ReceiveBuffer::Give -- out of data");
}
unsigned char byte = buf[index];
++index;
return byte;
}[/source]
This function is extremely readable, all statements are at the same level of abstraction, and a developer who knows little to nothing about your system can immediately understand the logic. On the other hand:
[source lang="cpp"]
sock_type connect2CMW (cmwa_config_data const& configData) {
sock_type sockfd = cmw::getSocket(SOCK_STREAM);
#if 0
sockaddr_in SockAddr;
SockAddr.sin_family = AF_INET;
SockAddr.sin_port = htons(56791);
SockAddr.sin_addr.s_addr = inet_addr("10.0.0.184");
if (::connect(sockfd, (sockaddr*)(&SockAddr), sizeof(SockAddr)) == 0) {
return sockfd;
}
#else
for (;;) {
#ifdef CMW_WINDOWS
int const basePort = 56791;
#else
int const basePort = 56789;
#endif
::std::string node(configData.localhost ? "127.0.0.1" : "www.webEbenezer.net");

for (int j = 0; j < 6; ++j) {
addrinfo *res;
getaddrinfo_wrapper(node.c_str(), basePort + j % 2, &res);

if (::connect(sockfd, res->ai_addr, res->ai_addrlen) == 0) {
freeaddrinfo(res);
#ifdef SYSLOG_AVAILABLE
syslog(LOG_INFO, "connected to CMW on port %d", basePort + j % 2);
#endif
return sockfd;
}
// 111 is connection refused and 113 is no route to host.
printf("connect() failed on port %d. errno is %d\n"
, basePort + j % 2, GetError());
freeaddrinfo(res);
}

#ifdef CMW_WINDOWS
Sleep(configData.sleepseconds * 1000);
#else
sleep(configData.sleepseconds);
#endif
}
#endif
}[/source]
this one has a few different levels of abstraction (low level OS code, med-low level application code?), is quite a bit longer, and performs several steps in an apparently complex operation, but I wouldn't be able to readily identify the boundaries or proper high level names for those steps without being more familiar with the domain. Again, I'm not saying it's bad, incorrect, or anything like that, just that subjectively, I am less able to grok this than I (think) I would be if you were to do something like:

[source lang="cpp"]
sock_type connect2CMW (cmwa_config_data const& configData) {
sock_type sockfd = cmw::getSocket(SOCK_STREAM);
//Not sure if you sometimes need to enable that disabled code? Still under development?
for (;;) {
::std::string node(configData.localhost ? "127.0.0.1" : "www.webEbenezer.net");
int port = GetBasePort();
for (int j = 0; j < 6; ++j, port += j%2) {
addrinfo *res;
getaddrinfo_wrapper(node.c_str(), port, &res);

if (::connect(sockfd, res->ai_addr, res->ai_addrlen) == 0) {
freeaddrinfo(res);
AttemptLogCMWConnection(port);
return sockfd;
}
freeaddrinfo(res);
}
SleepUntilRetry(configData);
}
void AttemptLogCMWConnection(int port) {
#ifdef SYSLOG_AVAILABLE
syslog(LOG_INFO, "connected to CMW on port %d", port);
#endif
}
void PrintCMWConnectionError(intport) {
// 111 is connection refused and 113 is no route to host.
printf("connect() failed on port %d. errno is %d\n"
, port, GetError());
}
void SleepUntilReconnect(cmwa_config_data const& configData) {
#ifdef CMW_WINDOWS
Sleep(configData.sleepseconds * 1000);
#else
sleep(configData.sleepseconds);
#endif
}
int GetBasePort() {
#ifdef CMW_WINDOWS
return 56791;
#else
return 56789;
#endif
}
[/source]
I may have mangled the logic, and again, I am not implying that this is better (unless your compiler is really good, it may be slower for instance). Hope this illustrates what I mean better though smile.png
Advertisement
Hmm, the code formatting didn't work, so it looks like I've made an unmitigated mess of your code. This is troubling.
Edit: Fixed


Decompose functions. Every function should be short (2-10 lines) and every statement should be at the same level of abstraction. This will do wonders to avoid duplication and (if you name your functions well) dramatically improve readability. If you have a long function, and you're not sure how to split it up, first just chunk it into blocks, then look at what each of those blocks does. Whenever you end up with two short functions that are similar, try to make them identical and then remove one.


That's a little extreme. You shouldn't break a function up into many smaller functions unless you need to. You seem to be advocating doing this for the sake of doing it.

That's a little extreme. You shouldn't break a function up into many smaller functions unless you need to. You seem to be advocating doing this for the sake of doing it.

Sorry if it wasn't clear, my advice was given in the spirit of "if you aren't sure how to acquire the taste for function decomposition, then..." and not as a general practice. I believe there are few, if any, situations where a long, complex process with multiple levels of detail should be all thrown into a single function, but it takes some forced practice to determine when a function will benefit from a split.
So, to be clear, I am not saying that all functions must be a particular length, or that they should be split arbitrarily if no obvious distinction arises. What I am saying is that obvious distinctions are often present, and there are advantages to capitalizing on them.

Edit: Also, I don't believe "you need to" do anything with regards to styling in code, rather, some styles promote code reuse, some promote readability, some promote maintainability, etc. In my experience, function decomposition is a very beneficial style, and to only employ it only when I am compelled would be to forego its benefits altogether.

[quote name='Spirrwell' timestamp='1341599698' post='4956420']
Again thanks, but I ask again, why would I remove the typedef from the struct?

You can leave the typedef there.
But nowadays in C++ it's common to use "struct A" instead of "typedef struct A". The later is more C style.
There is difference of them in C, but not in C++.
http://stackoverflow...def-struct-in-c
[/quote]
No. Don't use typedef there, it buys you nothing but more trouble. That answer you linked is wrong (it talks about C anyway).

You can't forward declare the type for starters.

also http://stackoverflow.com/a/1084204

I may have mangled the logic, and again, I am not implying that this is better (unless your compiler is really good, it may be slower for instance). Hope this illustrates what I mean better though smile.png


I like what you did with the port and have used that in this revised version.


for (;;) {
#ifdef CMW_WINDOWS
int port = 56791;
#else
int port = 56789;
#endif
::std::string node(configData.localhost ? "127.0.0.1" : "www.webEbenezer.net");
for (int j = 0; j < 6; ++j, port += j%2) {
try {
sock_type sockfd = connect_wrapper(node.c_str(), port);

#ifdef SYSLOG_AVAILABLE
syslog(LOG_INFO, "connected to CMW on port %d", port);
#endif
return sockfd;
} catch (::std::exception const& ex) {
// 111 is connection refused and 113 is no route to host.
printf("%s", ex.what());
}
}

#ifdef CMW_WINDOWS
Sleep(configData.sleepseconds * 1000);
#else
sleep(configData.sleepseconds);
#endif
}



Probably I'll take your advice on adding a function for the sleep stuff also. The new connect_wrapper function takes care of the getaddrinfo stuff so those details aren't showing up like they were. I'm using connect_wrapper in this (middle) tier and in the front tier. After the change, the size of the middle tier binary stayed the same and the size of the front tier was reduced about 200 bytes. So that seems fine.
You may be able to do the same thing with syslog:

void syslog_opt(int priority, const char * format, va_list ap)
{
#ifdef SYSLOG_AVAILABLE
syslog(priority, format, ap);
#endif
}

I wouldn't be surprised if this were optimized out on systems without syslog being available.

C++: A Dialog | C++0x Features: Part1 (lambdas, auto, static_assert) , Part 2 (rvalue references) , Part 3 (decltype) | Write Games | Fix Your Timestep!


You may be able to do the same thing with syslog:

void syslog_opt(int priority, const char * format, va_list ap)
{
#ifdef SYSLOG_AVAILABLE
syslog(priority, format, ap);
#endif
}

I wouldn't be surprised if this were optimized out on systems without syslog being available.


Thanks. I'm not sure if you were using a pseudo code form there or if there's another way to do it. This is the only way I know how to do it.

void syslog_opt(int priority, const char * format, ...)
{
#ifdef SYSLOG_AVAILABLE
va_list varg;
va_start (varg, format);
vsyslog(priority, format, varg);
va_end (varg);
#endif
}


That adds 92 bytes to the executable, but I think I'll keep it.

Thanks. I'm not sure if you were using a pseudo code form there or if there's another way to do it. This is the only way I know how to do it.

Let's call it pseudo code :). I should have mentioned I was guessing at the syntax. I knew it was possible I just forgot how it was done.

C++: A Dialog | C++0x Features: Part1 (lambdas, auto, static_assert) , Part 2 (rvalue references) , Part 3 (decltype) | Write Games | Fix Your Timestep!

This topic is closed to new replies.

Advertisement