Archive for the 'Programming' Category

21
Oct
09

File Descriptor Leaks

Sorry to have kept you waiting. I have been very, very busy recently. For starters, I am trying to complete parts of my creative writing exercise at govz.wordpress.com. Anyway, a number of people from other countries (USA, Romania, Korea, China, India) have been hitting this blog entry for quite some time now. When I checked this blog’s history, a rather interesting set of questions or search filters came up. Here are a few of them:

- “multi-thread file descriptor open close”
- “fork / clone file descriptor”
- “file descriptor leak debug”
- “creating a file descriptor”
- “how to reinitialize file descriptor c++”

In a way, I feel happy because, though this is an unsavory topic for most, it turns out to be important to someone else, somewhere in the world. And by fate, they have somehow found their way to my blog and my mundane musings. I just want to repeat this again though: I am not and will never be a supreme-level engineer (the likes of Dan Saks, Bjarne Stroustrup, Dennis Ritchie, Jon B. Postel, Paul Vixie, Steve W. etc…). I am still a tremendous work in progress. Should you find a flaw in my data, please do leave a message. After all, learning is fun when more heads come together. :)

Also please take into consideration that the following discussion is

  1. limited to user-land level programming (meaning not within your Operating System or kernel)
  2. limited to systems with kernels which have management mechanisms for all types of resources (i.e. memory, descriptors, etc…).
  3. limited to systems which free all used resources on process termination.
  4. limited to UNIX or UNIX-like systems. (Sorry, windows developers. :( )

Ok? :D Here we go, “beam us up, scotty”. (Sorry, for a moment there, I missed, Kirk , Spock, Data, T-Pol and Janeway of Star Trek).


I. File Descriptor Fundamentals

Before we finally wrap up the sample code of the previous blog, it is imperative that we understand certain fundamentals about file descriptors.

A. In the beginning, there was and is the trinity.

Did you know that, when you start an ordinary program, you automatically open three file descriptors? These file descriptors are embedded within the streams known to us as stdin (standard input), stdout (standard output), and stderr (standard error). In an ordinary process, these three streams automatically obtain file descriptor ids 0, 1, and 2.

So in a simple application, the maximum usable number of descriptors per process is equal to the process limit minus three.
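
To see the trinity for yourself, here is a minimal sketch that asks each standard stream for its underlying descriptor via fileno(). The function name std_streams_use_0_1_2() is my own invention for illustration; only fileno() and the STD*_FILENO constants come from the system.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper: returns 1 when stdin, stdout and stderr
 * sit on descriptors 0, 1 and 2 respectively, 0 otherwise. */
int std_streams_use_0_1_2(void)
{
    return fileno(stdin)  == STDIN_FILENO  &&   /* 0 */
           fileno(stdout) == STDOUT_FILENO &&   /* 1 */
           fileno(stderr) == STDERR_FILENO;     /* 2 */
}
```

Calling std_streams_use_0_1_2() from main() in an ordinary process should return 1.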

Can you close them? Yes, certainly! “fclose(stdout)” closes the standard output. (We use fclose() because stdout is a stream, i.e. a FILE* data type.) It is pretty much a valid command. :) However, once stdout is closed, “printf” or “cout” will be practically useless, and calling stdout-related commands on a closed stdout is undefined behavior that MAY crash the application.

For daemon developers, however: on BSD systems (and on Linux with glibc) the “daemon()” call can take care of these three descriptors for us. Its signature is “daemon(nochdir, noclose)”. When “noclose” is zero, the system points the three streams’ descriptors to “/dev/null”, so that all stdout-bound data is sent safely to the null device; a non-zero “noclose” leaves the three descriptors untouched. Daemonizing by hand, on the other hand, is quite a tedious task that involves “forking” and a lot of other stuff that eventually redirects or re-opens the three basic streams on “/dev/null”.

“Redirection” in this context means that your output or input is redirected to another device (or a file for that matter) instead of your terminal. This is the reason why, in any UNIX or UNIX-like system, executing “ls -a > temp.txt” saves the result of the “ls -a” command to the file “temp.txt”. Anyway, I just wanted to share the basic idea of IO redirection.
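
For the curious, this is roughly what a shell does for “> temp.txt”, sketched with dup() and dup2(). The two helper names, redirect_stdout_to() and restore_stdout(), are hypothetical names of mine; treat this as an illustration under those assumptions, not as the way any particular shell implements it.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper: point descriptor 1 (stdout) at `path`.
 * Returns a duplicate of the old stdout for restore_stdout(), or -1. */
int redirect_stdout_to(const char *path)
{
    int saved, target;

    fflush(stdout);                       /* do not lose buffered output */
    saved = dup(STDOUT_FILENO);           /* remember the old destination */
    if (saved < 0)
        return -1;

    target = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (target < 0 || dup2(target, STDOUT_FILENO) < 0) {
        if (target >= 0)
            close(target);
        close(saved);
        return -1;
    }
    close(target);                        /* descriptor 1 now refers to the file */
    return saved;
}

/* Hypothetical helper: undo redirect_stdout_to(). Returns 0 or -1. */
int restore_stdout(int saved)
{
    int rc;

    fflush(stdout);
    rc = (dup2(saved, STDOUT_FILENO) < 0) ? -1 : 0;
    close(saved);                         /* otherwise the duplicate leaks */
    return rc;
}
```

Anything written to descriptor 1 between the two calls lands in the file. Note the close(saved) at the end: forgetting it is exactly the kind of leak this blog is about.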

B. File Descriptor Acquisition Behavior

As far as my experience can take me, commands that generate file descriptors include (but are not limited to): open(), creat(), fopen(), freopen(), socket(), socketpair(), accept(), dup(), dup2(), and fcntl() (with F_DUPFD). Be warned though, that some commands may not be supported in Linux or other UNIX and BSD flavors. Also, there may be other commands that result in descriptor generation. My list only includes commands which I am aware of.

Now, as I have said, once we execute any of these commands, a new file descriptor is created and naturally the maximum number of usable descriptors for the process (and the system as a whole) is lessened by 1.

In almost all of the platforms I have worked with, once we ask the system for a file descriptor, the system returns the lowest available file descriptor. So in normal conditions, where the three standard streams are not closed, the first file descriptor you acquire will be 3. If, on the other hand, you have closed all three (as a daemon might), the first return value will be file descriptor “zero”.
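
The lowest-number rule is easy to check. The sketch below (the function name is hypothetical, and it assumes a single-threaded process so that nobody else grabs descriptors in between) opens two descriptors, closes the lower one, and verifies that the next open() hands the same number back.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: returns 1 if a freed descriptor number is reused
 * by the next open(), 0 if not, and -1 if any call fails. */
int lowest_free_fd_is_reused(void)
{
    int first, second, third, reused;

    first = open("/dev/null", O_RDONLY);
    if (first < 0)
        return -1;
    second = open("/dev/null", O_RDONLY);
    if (second < 0) {
        close(first);
        return -1;
    }

    close(first);                         /* `first` is now the lowest free slot */
    third = open("/dev/null", O_RDONLY);
    if (third < 0) {
        close(second);
        return -1;
    }

    reused = (third == first);
    close(second);
    close(third);
    return reused;
}
```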

C. Forks, Clones and Virtual forks

For those who have not yet had the experience of “fork”ing, “vfork”ing, and “clone”ing, it is a privilege to welcome you to the world of process creation/duplication. Well, at the very least for C and C++. Be mindful again that the above commands may not be supported on your platform, and that threads and processes are entirely different things in UNIX and UNIX-like systems. Anyway, fork and other system calls of its kind are called by one process to create a new (child) process. It’s purely asexual though. :) There are rules and norms covering this “process forking” mechanism, but I won’t discuss them in detail here. :) There is only one thing I want to share though.

If a parent process, creates a child process via fork (or any of the above commands), the parent’s file descriptors are inherited by the child.

This is by design. But personally, I sometimes see this as an “inherited leak”. In a simplistic diagram, I would like to show you what happens after a child process is called :


                 ( P1 )
                    x -- start
                    |
      [fd1 = open("myfile.txt") = 3]
                    |
           [fd2 = socket() = 4]
                    |
           [fd3 = accept() = 5]
                    |                [child process shares descriptors 0-5]
        [create child :: fork()] - - - - - - - ->   ( child P2 )
                    |                                    |
                    |                   [p2fd1 = open("myfile3.txt") = 6]
                    |                                    |
                    |                                    |
                    x                                    x

In the diagram above, after a fork() is executed, Process 2 starts and opens another file. In this case the resulting file descriptor for P2’s open() call is 6 and not 3, because descriptors 0 to 5 are already taken in the child. Also, if the same file, like “myfile.txt”, is opened again in the child process, an error might occur, depending on the settings of the parent’s “open()” call for “myfile.txt”. But anyway, my point is simple: child processes inherit their parent’s file descriptor table. (Actually this inheritance is useful in shell-command execution, as it provides a method for “piping” data from the shell to another process.)
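
Here is a small sketch of that inheritance put to good use. The parent creates a pipe, forks, and the child writes into the pipe descriptor it never opened itself. The function name read_from_child() is hypothetical, and the single read() assumes a short message (well below the pipe buffer size).

```c
#define _POSIX_C_SOURCE 200809L
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: the child writes `msg` into an inherited pipe and
 * the parent reads it into `buf`. Returns bytes read, or -1 on failure. */
ssize_t read_from_child(const char *msg, char *buf, size_t buflen)
{
    int fds[2];
    pid_t pid;
    ssize_t n;

    if (buflen == 0 || pipe(fds) < 0)
        return -1;

    pid = fork();
    if (pid < 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    if (pid == 0) {
        /* Child: fds[0] and fds[1] are valid here without any open() call. */
        ssize_t w;
        close(fds[0]);                    /* the child only writes */
        w = write(fds[1], msg, strlen(msg));
        close(fds[1]);
        _exit(w == (ssize_t)strlen(msg) ? 0 : 1);
    }

    close(fds[1]);                        /* the parent only reads */
    n = read(fds[0], buf, buflen - 1);
    close(fds[0]);
    waitpid(pid, NULL, 0);                /* reap the child, no zombies */
    if (n < 0)
        return -1;
    buf[n] = '\0';
    return n;
}
```

Notice that each side closes the pipe end it does not use. Every inherited descriptor you leave open in the child is one more slot taken from its table.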

But if you have no plans whatsoever to use any of the parent’s file descriptors, you can use the “close()” command in the child process to close specific file descriptors. Or better yet, where your platform provides it (the BSDs and Solaris do, as do recent versions of glibc), you can use the “closefrom()” command, to avoid having to loop over the opened descriptors and close them one by one.

D. Process Transformation

Did you know that you can transform processes? If you did not know that, I welcome you yet again. More often than not, a “fork”-er, an individual who forks his process, has one ultimate desire: to execute a different command or program. Converting one process into another program, or process transformation, is done by calling one of the “exec” family of commands, such as “execv()” or “execl()”. Take a look at the diagram below. Try it out if you have a UNIX/UNIX-like system with you.


    ( P1 - system() )                       ( P2 - execv())
           x -- start                              x -- start
           |                                       |
           |                      [char *temp[3] = {"ls","-a",NULL};]
           |                                       |
[printf("transforming.");]               [printf("transforming");]
           |                                       |
  [system("ls -a");]                    [execv("/bin/ls", temp);]
           |                                       |
 [printf("ls finished.");]           [printf("execv finished.");]
           |                                       |
           x end of program                        x end of program

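I assure you 100% (provided execv has no error), you will never see the message “execv finished.” on your screen, meaning P2 has been completely transformed. With the system() command, on the other hand, “ls” is executed and then the calling process continues, so you see the “ls finished.” message right after “ls -a” completes. (Keep in mind that execv() does not search your PATH, so it needs the full path of the program, and that the first entry of its argument array is by convention the program name itself.)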
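
The two columns of the diagram can be combined: fork first, transform only the child, and let the parent wait. This is, in spirit, what system() does for you. Below is a minimal sketch; the name run_and_wait() is hypothetical, and it assumes a shell lives at /bin/sh.

```c
#define _POSIX_C_SOURCE 200809L
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: runs `/bin/sh -c command` in a child process and
 * returns the child's exit status, or -1 on failure. */
int run_and_wait(const char *command)
{
    char *argv[] = { "sh", "-c", (char *)command, NULL };
    pid_t pid;
    int status;

    pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        execv("/bin/sh", argv);           /* on success this never returns */
        _exit(127);                       /* reached only if execv() failed */
    }

    if (waitpid(pid, &status, 0) < 0)     /* the parent carries on after the child */
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

For example, run_and_wait("ls -a") lists the current directory and then returns control to the caller, while any line placed after execv() in the child only runs on failure.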

But as this blog is not about process transformation, I would just like to state that the file descriptor table persists even after an execv() is called. Simply put, by default, the file descriptors opened before the execv() call still exist in the transformed process. (This is the default behavior for most systems, provided of course that the files were open()-ed with default file-flag settings.)

This, however, can be avoided by setting the necessary flag via the “fcntl()” command. Using “fcntl()”, set the opened file descriptor’s close-on-exec flag (FD_CLOEXEC) any time before you call execv(). The flag is per descriptor: whenever an execv() or execl() is executed, the system automatically closes every descriptor that has the flag set and leaves the others open.
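
Setting the flag takes two fcntl() calls: read the current descriptor flags, then write them back with FD_CLOEXEC added. A minimal sketch follows; set_cloexec() and has_cloexec() are hypothetical helper names of mine.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: mark `fd` to be closed on exec. Returns 0 or -1. */
int set_cloexec(int fd)
{
    int flags = fcntl(fd, F_GETFD);       /* read the descriptor flags */
    if (flags < 0)
        return -1;
    return (fcntl(fd, F_SETFD, flags | FD_CLOEXEC) < 0) ? -1 : 0;
}

/* Hypothetical helper: 1 if `fd` has FD_CLOEXEC set, 0 if not, -1 on error. */
int has_cloexec(int fd)
{
    int flags = fcntl(fd, F_GETFD);
    if (flags < 0)
        return -1;
    return (flags & FD_CLOEXEC) != 0;
}
```

On systems that support it, passing O_CLOEXEC directly to open() achieves the same thing in one step and leaves no window between the open() and the fcntl().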

E. Maxima – Maximum Descriptor Count
(I suddenly remembered my minima-maxima derivative mathematics hahaha … collectively known as extrema.). Anyways, for almost all UNIX based systems, getdtablesize() is supported. If your program needs to know the maximum number of file descriptors a process can open at any time, then you can use the getdtablesize() function.

Remember though that if getdtablesize() is 32, it follows that the lowest possible descriptor you might get is zero and the highest descriptor value you will get is 31.


II. File Descriptor Theory Conclusion

By now, I think the information above is enough for all of us to understand the holistic-overviewish-nature of file descriptors. This primer blog may not be complete but I am guessing this is enough to :

  • Spark your curiosity
  • Create a semi-complete mental image of file / fd usage in programming
  • And hopefully in some parts I have made you aware of some of the different things you can do in C/C++ like process generation and process transformation.

Within the next few days, I will upload my next blog, discussing my own file descriptor leak debugging techniques.

For the many C-developers out there, I wish you well. In a way, right now, Java seems to be taking over. But don’t worry, I think C and  C++ will ALWAYS be around. :) Good Day.

Up Next : File Descriptor Leak Debugging


05
Aug
09

File Descriptors

Of the four leaks I listed in the previous blog, I want to discuss file descriptor leaks first (whether there will be a second, I don’t know yet). I would like to share with you (or keep as a note for myself) an experience of mine with this type of leak. For the most part, I have written this blog so as not to forget what I learned from the experience. I also wrote it to share what I picked up along the way, i.e. the difficulty of file descriptor leak debugging, the techniques I employed to find and fix the leak, and the tools I used, plus some overall knowledge I think I have acquired about UNIX / Unix-like systems.


Files, what art thou?

Files are the simplest forms of data storage. Of course, I think even those who don’t develop software are kinda familiar with this fact. But (and it is a big BUT!) files are also a convenient and rudimentary method of signaling and/or synchronization. Developers like myself refer to this signaling mechanism famously as IPC – Inter Process Communication. For example, imagine two processes, Process1 (P1) and Process2 (P2). Let us say that before Process1 starts sending requests, Process2 has to be ready.

P1 and P2’s synchronization mechanism can be simplistically described by the following :

    ( P1 )                          ( P2 )
      x -- start                      x -- start
      |                               |
 [initialize]                    [initialize]
      |                               |
      |                      [resource preparation]
      |                               |
      |                              [b] --- announce readiness.
     [a]  wait 'til P2 ready          |
      |                               |
      |                      [wait for requests]
      |                               |
      x -- start sending              x
           requests to  P2

True, with a simple libc::kill() command (yep, “kill” does not mean “terminate” all the time, my young friends), P2’s existence can be checked, but its state can never be determined (not unless P2 is synch-safe and has a complex signal handler in place). So in the above example, a file can be used for signaling/synchronization. P1 will routinely check for the existence of a certain file via libc::stat() or libc::access(), while P2 will be the one to create the file. So if we substitute [a] with “loop and sleep until the file /ipcs/process2_ready.dat exists” and [b] with “retry creating /ipcs/process2_ready.dat until successful”, it makes a whole lot of sense, right? Of course the flow has to be refined some more, but the gist is more or less like that. (Doubting Thomas : files are resident, aren’t they? At the next start-up, a false positive will occur at P1 since the file is still present. Answer : that, my friend, is a trick I have to teach later.)
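
Steps [a] and [b] can be sketched in a few lines. The function names and the one-second polling interval are assumptions of mine; the marker path is passed in so you can point it at /ipcs/process2_ready.dat or anywhere else. The stale-file problem raised by our Doubting Thomas is deliberately not handled here.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <unistd.h>

/* [b] P2 side (hypothetical helper): create the marker file.
 * Returns 0 on success, -1 on failure. */
int announce_ready(const char *marker)
{
    int fd = open(marker, O_WRONLY | O_CREAT, 0644);
    if (fd < 0)
        return -1;
    return close(fd);                     /* the file's existence is the signal */
}

/* [a] P1 side (hypothetical helper): check for the marker up to
 * `max_tries` times, sleeping one second between checks.
 * Returns 0 once the marker exists, -1 if it never shows up. */
int wait_until_ready(const char *marker, int max_tries)
{
    for (int i = 0; i < max_tries; i++) {
        if (access(marker, F_OK) == 0)
            return 0;
        if (i + 1 < max_tries)
            sleep(1);
    }
    return -1;
}
```

Note that announce_ready() closes the descriptor right away. The marker file keeps existing on disk, so nothing needs to stay open.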


Files are everything.

In Windows, this paradigm might not ring true. But you see, UNIX or UNIX-like operating systems (FreeBSD, OpenBSD, Linux, etc…) have one simple yet fundamental precept: everything is a file. (For Unix-like OSes, however, it is more like “almost everything is a file.”) It follows, too, that any UNIX-based developer has to be wary of each open file descriptor in his process. File descriptors in the UNIX context are not just about “files” per se. Like what all those “better” guys tell you on the internet, in Unix/Unix-like systems, once you open a device, you acquire a file descriptor. When you libc::fork() a process or libc::clone() it, and use libc::pipe() to talk to it, you use two file descriptors. For some, this tidbit of knowledge may not be really important, especially when multi-process-multi-thread operations are unnecessary. But for those who create “daemons” or “timing critical multi-process-multi-thread applications” for a living, this piece of trivia might come in very useful.


What then is a file descriptor?

A file descriptor, in all its simplicity, is a numeric handle/value which represents a file you have opened. To put it in easy terms, a file descriptor is like the Operating System’s (OS) translation of a very long file name. Let us take for example the following file, which has a name of :
–> “C:\\directory1\directory2\iamaveryveryveryverydveryveryverylongfilename.txt”

In the OS layer however, once the above file is opened, the file is merely the process’ file descriptor : 3. Now wait a minute, why “3” and not zero or one? Later I will explain :D . For the moment take it as it is. :)

The obnoxiously long file name is converted simply to file number : (3).  In C or C++ it can be done via the following line (the third argument sets the permissions of a newly created file and is required whenever O_CREAT is used) :

fd = open("myfile.txt", O_CREAT | O_WRONLY, 0644);  (C/C++)

Theoretically, libc::fopen() also does the same thing, but the result of such a function is not really a file descriptor. libc::fopen() has a return type of FILE *, but there is a way to retrieve its underlying file descriptor: use libc::fileno(). More on this later (when I can get back to it …. like 10 years from now.)
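
Since “later” may take a while, here is a minimal sketch of fileno() at work. The wrapper name descriptor_behind_stream() is hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>

/* Hypothetical helper: opens `path` as a stream, looks up the descriptor
 * hiding underneath it, and closes the stream again.
 * Returns the descriptor number that was used, or -1 on failure. */
int descriptor_behind_stream(const char *path)
{
    FILE *fp = fopen(path, "r");
    int fd;

    if (fp == NULL)
        return -1;
    fd = fileno(fp);                      /* the number the OS knows the file by */
    fclose(fp);                           /* also closes the descriptor underneath */
    return fd;
}
```

In an ordinary process with the three standard streams open, the first call typically returns 3. The important part is the fclose(): a FILE * you forget to close leaks its file descriptor as well.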


So What’s with file opening?

Though syntactically file descriptor generation looks very simple, there are quite a few intricacies within the kernel / OS operation that go along with it. The OS actually does a few things during an “open()” call. I am unsure of the exact sequence, but in most Unix or Unix-like systems some of open()’s internal steps are the following :

1. OS checks the system limit of file descriptors and checks if it is full or not.
2. OS checks the process’ file descriptor table if it is full or not.
3. OS retrieves the lowest free index in the process table.
4. OS saves the data to the process table (i.e. file name, file position, etc…)
5. OS returns the file descriptor to the calling function.
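
Steps 1 and 2 are visible from user-land through errno: a failed open() reports EMFILE when the process’ own table is full and ENFILE when the system-wide table is full. The sketch below provokes the first case on purpose by temporarily lowering the soft limit with setrlimit(). The function name and the DEMO_LIMIT of 16 are arbitrary choices of mine for illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <fcntl.h>
#include <sys/resource.h>
#include <unistd.h>

#define DEMO_LIMIT 16                     /* arbitrary small table size */

/* Hypothetical helper: shrinks the descriptor table, opens /dev/null until
 * open() fails, cleans up, and returns the errno of the failed open()
 * (0 if no open failed, -1 if the limits could not be changed). */
int errno_when_table_is_full(void)
{
    struct rlimit saved, lowered;
    int opened[DEMO_LIMIT];
    int count = 0, err = 0, fd;

    if (getrlimit(RLIMIT_NOFILE, &saved) < 0)
        return -1;
    lowered = saved;
    if (lowered.rlim_cur > DEMO_LIMIT)
        lowered.rlim_cur = DEMO_LIMIT;
    if (setrlimit(RLIMIT_NOFILE, &lowered) < 0)
        return -1;

    for (;;) {
        fd = open("/dev/null", O_RDONLY);
        if (fd < 0) {                     /* table full: remember why */
            err = errno;
            break;
        }
        if (count == DEMO_LIMIT) {        /* safety net, should not happen */
            close(fd);
            break;
        }
        opened[count++] = fd;
    }

    while (count > 0)                     /* give every descriptor back */
        close(opened[--count]);
    setrlimit(RLIMIT_NOFILE, &saved);     /* restore the original soft limit */
    return err;
}
```

A program that leaks descriptors eventually hits the very same EMFILE, only without asking for it.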


Teaching By Example

“That’s easy, for every file you open, you close it.” The rule is correct. However, the mechanism of how you open and close a file matters more than anything else. Let us analyze the following code. In what case do you think a file descriptor leak will happen? For this exercise, let us assume that the libc::close() call never fails.

/*********************************************************
 *    Created by      : Gauvin L. Repuspolo
 *    What is this    : File Descriptor Leak Example
 *    My Birthday     : 1977/02/21 [just goofing around]
 *********************************************************/
#include <string.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <fcntl.h>

#define SAMPLE_LAUGH   "hahahaha"
#define SAMPLE_CRY     "huhuhuhu"

int generic_fd = 0;

/****************************************************
 * name       : write_something
 * parameters : (input) is_laughing
 *              --> <= 0 - write SAMPLE_CRY
 *              --> >  0 - write SAMPLE_LAUGH
 * return     : on success returns 0
 *            : on failure returns -1
 ****************************************************/
int write_something(int is_laughing)
{
   int write_result =  -1;
   /* let us use a temporary pointer */
   const char *temp_ptr = (is_laughing > 0) ? SAMPLE_LAUGH : SAMPLE_CRY;

   /* O_CREAT needs a third argument: the mode of a newly created file */
   generic_fd = open("my_file.txt", O_CREAT | O_RDWR | O_APPEND, 0644);
   if (generic_fd < 0) {
      return write_result;
   }
   /* since this is only an example, let's compare directly;
      the cast avoids comparing a signed ssize_t with an unsigned size_t */
   if (write(generic_fd, temp_ptr, strlen(temp_ptr)) == (ssize_t)strlen(temp_ptr)) {
      write_result = 0;
   }
   close(generic_fd);
   return write_result;
}

int main()
{
     /* let us assume that the  close() call never fails.
     create a sample content of main() that uses write_something()
     and then in the end generate a file descriptor leak.
     good luck .... */
     return 0;
}

Up next … fixing file descriptor leaks …


15
Jul
09

OS Resource Leaks

Anyone who works on an embedded platform or even in PC applications, should probably by now understand full well the implications of a resource leak.  Before we start delving into this matter however it is imperative that we have a full grasp of what types of resources the Operating System (OS) provides to a “user-land” application. For starters let me give you four resources which could possibly be leaked during execution of your program:

  • Memory leak – The most celebrated of all the leaks. Simplistically, the failure to inform the OS that you are finished using a memory area, thus making the OS keep that memory area reserved until the process ends (or, on systems that never reclaim it, until the machine is reset). This applies to malloc(), calloc(), realloc(), opendir(), mmap() and others of the like.
  • Thread leak – An OS can only allow a certain number of threads running at the same time for a particular process. Exiting unnecessary threads (pthread_exit(), not libc::exit(), which ends the whole process) is a MUST, for they share your process’ precious time quantum with the other threads. And with special reference to thread-based implementations of server applications (like FTP, HTTP, LDAP, Discovery etc…), you may end up unable to serve further requests should you fail to properly account for your threads.
  • Process leak – Like thread leaks, unnecessary processes should either be “killed” or “SIGTERM”ed. Don’t tell me you love rogue processes and good-for-nothing zombies? Failure to properly end a process may result in the OS executing unnecessary context switches for processes that don’t really matter. Like threads, when your “server” or “daemon” application is serving multiple requests via “fork()”, “vfork()” or “clone()”, you might end up unable to serve further requests from your clients if unnecessary processes are left to run idly.
  • File Descriptor / Handles Leaks - The operating system also limits the number of files that an application can open simultaneously. So if you have got a leak on this one, you are looking forward to an unnecessary debugging adventure. For starters, some open source libraries, automatically assert() when they fail to take a hold of a valid file descriptor.
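
For the last kind of leak in the list above, a cheap way to catch one in the act is to count the open descriptors before and after a suspicious piece of code. The sketch below probes each slot with fcntl(F_GETFD), which fails on slots that are not open. The function name and the idea of passing an upper bound are my own assumptions; pick a bound that comfortably covers your process.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: counts the descriptors in [0, upto) that are
 * currently open in this process. */
int count_open_descriptors(int upto)
{
    int count = 0;

    for (int fd = 0; fd < upto; fd++) {
        if (fcntl(fd, F_GETFD) != -1)     /* fails with EBADF on a free slot */
            count++;
    }
    return count;
}
```

If the count after a call is higher than the count before it, and you did not mean to keep anything open, something in that call forgot its close().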


Why are leaks dangerous?

A leak in the software realm is not like a leaky pipe that eventually floods up your room if left unfixed over long periods of time. (I left our faucet once fully open overnight and the next day the house was really a mess! The water distribution service bill was almost as messy.) A resource leak in software is pretty much the opposite though. Once there are leaks in certain areas of a process, be it memory or something else, that same process or other newly started programs, will eventually find themselves failing in certain system calls which would have succeeded in an otherwise zero-leak running environment.

Summarizing: leaks result in only one thing, the eventual depletion of the resources of an otherwise perfectly “enough” system. “Enough” because theoretically, except for some cases, what an OS provides is enough for everything it was designed to do, and that includes the facilitation of user-land processes.


What are the consequences of leaks?

The thing I hate most about leaks is that they have the ability to affect other processes. And when they do, the culprit is almost impossible to detect. Think of a room full of people where somebody suddenly silently farts. Everyone suffers, but when and where it happened nobody really knows. (hahahahah! PEACE!)

They too are extremely difficult to debug and it takes an enormous amount of time to pinpoint their exact location. Fixing them is not really the problem in most cases, but finding out where and when they happen is the most difficult of all.  Enumerating some effects :

  1. System fails to allocate memory. (libc :: malloc(), calloc(), etc…)
  2. System fails to start a new thread. (POSIX : pthread_create())
  3. System fails to start a new process. (libc :: fork(), vfork())
  4. System fails to open new files. (libc :: open(), fopen(), dup(), etc…)
  5. System fails on system calls. (libc :: socket(), pipe(), etc….)
  6. Applications start to run really really slow.
  7. Applications suddenly crash due to low level assert().
  8. You enter the debugging twilight zone with a ticket to the universal competition for patience. Just kidding.

Probably in simple programs, you don’t have to worry too much. But if you are developing an all-too-powerful daemon which has to exist while the system is online (or as you morph from intern engineer level to non-assistant level), you have got to be paranoid about leaks. “Resource Leaks = MESS”, remember that.


Isn’t my OS the sky-is-the-limit version?

Yes dudes and dudettes! No OS is “mugen” – meaning “infinite” in Japanese.  Let me cite some examples just for fun (though completely unrelated and utterly useless). Did you know :

  • That server-based encrypted data is most likely valid for only 5 minutes?
  • That one real limit of your system time, its own “Y2K bug”, falls on “2/7/2036” (when NTP timestamps roll over), and that the UNIX version of the Y2K bug arrives in 2038?
  • That the limit of a USB cable is around 3 – 5 meters, depending on the speed you use?
  • That a NETBIOS system browser has to routinely list the domain every 12 minutes?

Now going back to the topic, OS resources are pretty much the same. How they came up with the limits is, I guess, an arbitrary science. Limits are most probably based on a careful balance of available memory/actual resources versus the rough average of “extreme usage” and “ordinary usage”. For the moment, I do not question it because, for me, they are “enough”. Plus the fact that so many scientific minds have no major complaints about it shows that it is, well, in a way “enough”.

Resources like file descriptors, for MOST operating systems, have process-wide and system-wide limitations.  Process-wide means that for each process you create, there is a limited number of file descriptors it can open simultaneously. Or, for a multi-threading process, the thread count limit is the maximum number of threads the process can run simultaneously.  The system-wide limit, however, caps the total count of a particular in-use resource, regardless of which process holds it.

Let us take for example some older versions of Linux which could open up to a maximum of 256 files per process, and roughly 1K system-wide. Therefore, for as long as the 1K system-wide limit is not breached, any process can have 256 files open simultaneously at any given time. But should there be 4 other processes, each opening 250 files at the same time, then 1,000 of the roughly 1,024 system-wide slots are already taken, and the 5th process cannot use its full 256 file limit anymore. (Check this out.)

Anyway, if you happen to be running on top of a UNIX platform, you might want to try the “ulimit -a” command in your bash terminal, to see certain limits of your operating system.


Can these limits be changed?

Yes. However, it is important to note here that changing a per-process limit, be it a hard or soft limit, might require some special procedure, like recompiling your kernel. For the most part though, commands like setrlimit(), ulimit() or sysctl() (via the libc::system() command) can be called within the program to modify certain soft and hard limits. Note also that setting a particular hard limit to an unreasonable value and then allowing a process to run up to it might eventually cause the system to break down. Besides, raising the limit for a particular resource will never be a solution for a resource leak!
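
As an illustration of the setrlimit() route, here is a minimal sketch that raises the soft limit on open descriptors, without ever going above the hard limit (which an ordinary process cannot exceed anyway). The function name is hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <sys/resource.h>

/* Hypothetical helper: make sure the soft descriptor limit is at least
 * `wanted`, clamped to the hard limit. Returns 0 on success, -1 on failure. */
int raise_descriptor_limit(rlim_t wanted)
{
    struct rlimit lim;

    if (getrlimit(RLIMIT_NOFILE, &lim) < 0)
        return -1;
    if (lim.rlim_max != RLIM_INFINITY && wanted > lim.rlim_max)
        wanted = lim.rlim_max;            /* stay within the hard limit */
    if (wanted <= lim.rlim_cur)
        return 0;                         /* already high enough */
    lim.rlim_cur = wanted;
    return setrlimit(RLIMIT_NOFILE, &lim);
}
```

This buys a leaky program some time, nothing more. The leak itself still has to be found and fixed.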

Next Up …  file descriptor leaks …


01
Jun
09

Embedded Software Paradigm (1)

One day in an undisclosed NASA facility…..

NASA Boss        : “Can somebody please go to MARS”
NASA Engineer    : “What for?”
NASA Boss        : “Upgrade the code for the MARS Rover?” (fictional)

For the general populace, software is just software, a set of instructions which dictates a computing device’s behavior. However for those of us who are initiated, “firmware”, is a bit different from streamline PC-application-development. Although they are fundamentally the same, I think that firmware development mentality is a bit different. I hope I did not lose you there. And please take note, I never said firmware is more difficult.

“A BIT BUT SIGNIFICANTLY DIFFERENT”

PC-Applications and firmware are differentiated by only one aspect. At the risk of stating the obvious, they are differentiated only by the environment/device on which they run, or more technically, what we call the “target”. It follows too that firmware, and its development, is subject to the nuances and frailties of an embedded device. Please refer to the previous blogs [1], [2] and [3]. These “nuances” have a direct and significant effect on the development mentality and process. This blog aims to discuss, tangibly and simply, the paradigm adjustment that goes with each “embedded device limitation”.

A. Embedded devices, generally, are “lesser” than the PC.
The term “lesser” here pertains to a multitude of aspects. For the moment let us limit it to computing power, and functional/program memory.

1. Effect of having Lesser computing power.
“Computing power” is really a very difficult subject. If you want to be technical about it, and if you don’t mind a little nosebleed, you can refer to “The Computer Engineering Handbook” by Vojin G. Oklobdzija. For the meantime, think of “computing power” as the computer’s “chi” or life force. The more it has, the more it can pump up the computing process. The less it has, the slower the system runs.

Almost a decade ago, we had a problem with a micro-controller. We were to ensure that at any point in time, our design should be able to save whatever temporary data there was inside the buffers into an EEPROM (Electrically Erasable Programmable Read-Only Memory).  Under normal operating conditions we really didn’t have a problem. The problem, however, occurred when the device lost power from its battery source.  Our design had to be good enough to save the “cached” data.

Our solution then was twofold: (a) implement a fast Interrupt Service Routine (ISR) which executed the “save-data” mechanism, and (b) make sure the save-data mechanism beat the decay rate of the system’s power supply.

Implementing the ISR was not the problem. The problem really was the execution speed of  our “store-data” function. Simple analysis showed that the speed of such a “transaction” relied mainly on :
1. the speed of the cpu or its computing power. (“how fast can it execute one command”)
2. the amount of time for one “write” transaction in the EEPROM. (“write-cycle”)
3. the amount of cached-data to be saved.

Anyway, I hope you catch my drift. Lesser computing power in the above example was really the bane of what we had in mind. If only our micro-controller could execute billions of transactions in one second, then I think we would have had fewer factors to worry about. But then again, a better controller would have skyrocketed the cost of materials.

2. Embedded devices have lesser functional memory

Memory here is categorical and does not refer to the number of images inside your mobile phone or digital SLR, nor the number of emails your Blackberry Smart Phone can keep. “Memory” here refers to the number of features / capabilities a computing device has in its disposal.

To emphasize further, let us say, that the brain remembers all the different skill sets you have. Like riding a bike, treading in water, writing a haiku, so on and so forth. The PC generally speaking, has enough brain space to store all these skill sets (plus more) and thereby allowing it to do a lot of things. Embedded devices on the other hand can only muster three or four. Some can store only one functionality.

A few years back, my boss wanted me to implement one special mechanism in our machine. For confidential purposes let us call it the “Lightning In A Bottle (LIAB)” mechanism. However LIAB’s original implementation was on top of the Java platform. For those unfamiliar with JAVA, simply think of it as the all useful VELCRO strap. Wherein if the opposite material is fibrous or “loop-full” enough, it will surely adhere to the trusty Velcro “hooks”.

Java is like that. A powerful programming base, developed with the “build / code once, run anywhere” principle in mind. Once you create a program in JAVA, you are almost ensured that it will run anywhere that supports JAVA. Whether it be MAC or INTEL or ZAURUS or what have you.

Anyways, during that time, my boss had two proposals for me :
1. Implement a simplified Java Virtual machine (create a special velcro strap)
2. Create my own version of the LIAB mechanism.
Due to the amount of risks option 1 entailed, I told my boss that option 2 is the best.

Bottom line, barring the very small technical disparity in my example, an embedded device has so limited a functional memory that, theoretically, “the number of ways to catch a mouse” is very limited. Needless to say, reinventing the wheel is not a “rare reality”.

*technical disparity – the differentiation was based on Java. But theoretically (“purist-tically”) this should be microprocessor to microprocessor, or architecture versus architecture.


… Paradigm changes because Embedded Devices are usage-specific …

12
May
09

Embedded Engineering

I have been an embedded engineer for the past ten years. For the first half, I was mainly involved in simple hardware development. On the software side, I worked on simple programming (a simple control loop) specific to particular sets of electronic chips (mainly ATMEL microcontrollers and PIC microcontrollers, using ANSI C and assembly language). The latter five years were mostly spent on embedded software with customized operating systems (RTOS – Real Time Operating Systems), mainly using the programming languages C and C++.

Since my hardware development faculties are rusty, and because I have been an embedded software developer for most of my short career, I would like to share some ideas I have, little as they may be, on embedded engineering. I hope to contribute to the growth of those who may not have had the opportunity of training that I have had, but who wish to learn this profession. (Bear in mind, though, that I too am a work in progress, and this blog is written with the intention of continuous improvement.)

What is embedded engineering? Embedded engineering is an applied science that is geared toward the creation of “embedded devices”. Samples of embedded devices are mp3 players, mobile phones, a computer’s video card, iPods, iPhones and, well, anything that basically acts like a personal computer but is not really one. With so many people creating sophisticated devices, and the laptop or notebook PC becoming smaller and smaller, an embedded device is sometimes misconstrued as a personal computer. But in this case, size really has no bearing. In my humble opinion (and in no way a universal truth), there are some important items that differentiate embedded equipment from its personal computer counterpart. These are :

[1] An embedded device is an electronic device which possesses very limited resources compared to the PC.
- it usually has less “brain” or computing power.
- it has a smaller “football field” for memory.

[2] An embedded device is an electronic device which is usage-specific; it cannot compute or process generic PC functions.
- your watch can’t open or write document files
- you can’t do spreadsheet computations on a scanner

[3] An embedded device may or may not stand on its own.
- the graphics card of your PC has no use if it is not “plugged in”.
- Well you know how it is with the PC.

Aside from these three, there are finer points that differentiate embedded devices from the personal computer. These finer differences, together with the ones listed above, play critical roles in the development process.

And like the other applied sciences, embedded engineering has many sub-disciplines. However, in my humble opinion, its two fundamental sub-disciplines are HARDWARE ENGINEERING (HW) and SOFTWARE ENGINEERING (SW).

HARDWARE EMBEDDED DEVELOPMENT

I may not do hardware engineers justice, for I lack the deep “gaussian-function-iiiish” level of hardware expertise. But to give some balance to this article: from a hardware standpoint, [2] stands out, because HW engineers have to select the fittest electronic component in light of what needs to be achieved. Think of it as having to select a wall-mounted TV when your living room wall space allows only a 40-inch mounting. Is it wise, then, to buy a $2000 60-inch flat panel high definition TV? In the same light, hardware engineers select, re-select and re-select components to fit their designs. Choosing components with the lowest cost without curtailing safety, quality and functionality is top priority. Sometimes, due to our cost-driven environment, HW engineers re-design around cheaper and more available components. Key point [2] indirectly governs the overall cost of the device, which in turn has an effect on the final market price or corporate profit.

In the plotting of schematics and the creation of a prototype, a hardware guy has to ultimately decide whether his implementation is better than an already available sub-system pre-built by somebody else. In a simplistic analogy, hardware people have to ask “Do I create a fingerprint scanner, or is it better to buy one from somebody else?”. Of course, the thought process is not as simple as that. But point [3] stands. Some embedded devices, like fingerprint scanners, cannot stand on their own, but they are created because somebody else will need them. (In hardware, there are times when one has to re-invent the wheel so as not to succumb to pricey components. The risk here is that development might take so much time and so many resources that the cost of Research and Development (R&D) will be bigger than originally planned.)

Unlike [2] and [3], point [1] is an effect of the general desire to make the product affordable or to increase corporate profit. Whichever the case, to achieve cost reduction, hardware engineers are obligated to build a system that has very limited resources but can, and will, get the job done. (In my personal experience, sometimes a brilliant management team has to step in to stop R&D engineers from over-designing, which results in too much cost and too much loss in time-to-market.)

SOFTWARE and FIRMWARE

As I have said, I have been an embedded software developer for most of my short career. (More specifically, I have worked on the development of customized applications, networking protocol implementations and, partially, in the realm of operating systems.) In technical slang, most if not all embedded software developers are dubbed firmware engineers (the term “firmware” was coined by Ascher Opler in 1967). Simply put, firmware or “microcode” is a kind of software which, once installed, needs some form of special mechanism for it to be updated or modified. Some are even designed “NEVER” to be updated. Firmware code usually makes up the barest essentials of the system. Examples of devices that have microcode in them are digital wrist watches, Engine Control Units (ECUs) and USB memory sticks.

However, with the advent of new technology which makes core-function upgrades relatively easy (FLASH technology and UVPROMs), Ascher Opler’s “softness”-based definition of firmware now needs some reconsideration. Since I am in no position to redefine it yet, I will point instead to Mark Smotherman’s “A Brief History of Microprogramming”, which I stumbled upon and which provides some clarity on what I think firmware has evolved into as of this time period.

For this blog series, however, I would like to refer to firmware as basically a piece of software that is intended for embedded devices.

Next …  (Embedded Software Paradigm )
