Mottek
Debugging and the Scientific Method

Once a program reaches a stable state in terms of architecture, design and functionality, debugging becomes the predominant activity in the development process (unless the architecture or design was flawed to begin with). The largest part of the debugging process is finding the root cause of errors, whereas the actual fixing is usually a much simpler activity: many bugs are fixed with one-line changes, or even one-character changes.

Verifying Assumptions

My approach to understanding why a program fails is to identify a set of hypotheses, and then test one after another. There are a couple of important points in this process: Testing can be done via various mechanisms, such as debugging, logging, modifying input data, modifying program code, modifying timings, or changing the execution environment. Each test should produce unambiguous results, and sequential verification is important in order to allow interpretation of those results. The hypotheses need to be simple enough and non-overlapping to reduce the set of possible root causes reliably; trying to verify too much at the same time can be quite difficult.

Reducing the Debugging Work

The actual verification process (implementing and running the tests) is usually simple but time consuming, which means that I try to automate as much as possible. But the true way to reduce debugging work is to identify the right set of hypotheses, and this is unfortunately also the toughest part of debugging. It is usually a mix of domain knowledge, intuition and a systematic approach that pays off. Domain knowledge can unfortunately only be gained by actually working with the program code; it is very rare that a problem is generic enough that it can be solved without a deeper understanding of the code.
Intuition, however, is another key component altogether: it allows making assumptions without the deepest possible knowledge. Rather, deliberate ignorance helps to avoid digging in too deep (after all, understanding a system down to the metal takes a long time) and to produce hypotheses faster, e.g. based on previous experience. There is of course the risk that such intuition-based hypotheses lead nowhere, and I have had many, many situations where I declared victory too early. Humility is therefore a very important partner to intuition!

Why Debuggers are Bad

Formulating a hypothesis about why a program is broken requires thinking and knowledge. Using a debugger usually increases the knowledge about the inner workings of a program, but it can easily stand in the way of the thinking part when coming up with ideas about error root causes. Debuggers are good for verifying mini-hypotheses when stepping through code, but they are very cumbersome for getting to the big picture and testing more complex problems (plus they are worthless for testing timing issues). My impression is that they are a popular tool because they tend to give immediate results on these mini-hypotheses, which gives the illusion of making progress. For me, it is far more important to read code, work with a peer, and use tracing and logging to understand the problem.

Deliberate Breakage

One of my favourite approaches to producing hypotheses is to take the program in the two states of working and broken, and reduce the difference in implementation between them until the smallest difference is identified as the root cause. In cases where a bug is the result of changes due to ongoing implementation work, this is very simple: git, for example, can bisect a series of changes, with working ones and broken ones being marked until the offending change is found. This approach can be extended by looking at different definitions of working and broken.
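
As a minimal sketch of the bisection idea (this is not git's actual implementation), the search for the offending change could look roughly like this in Python. The function name `first_broken` and the `is_broken` callback are hypothetical, and the sketch assumes that the newest change is broken and that every change after the offending one is broken as well.

```python
from typing import Callable, Sequence, TypeVar

T = TypeVar("T")


def first_broken(changes: Sequence[T], is_broken: Callable[[T], bool]) -> T:
    """Return the earliest change for which is_broken() is true.

    Assumption: once a change breaks the program, all later changes
    are broken too, so a binary search over the history is valid.
    """
    if not changes or not is_broken(changes[-1]):
        raise ValueError("the newest change must be broken")
    lo, hi = 0, len(changes) - 1  # invariant: changes[hi] is broken
    while lo < hi:
        mid = (lo + hi) // 2
        if is_broken(changes[mid]):
            hi = mid  # the offending change is mid or earlier
        else:
            lo = mid + 1  # the offending change comes after mid
    return changes[lo]


# Hypothetical history of ten changes where change 7 introduced the bug.
history = list(range(1, 11))
offender = first_broken(history, lambda change: change >= 7)
```

Each call to `is_broken` stands for one test run (build, install, try to reproduce), so the number of runs grows only with the logarithm of the number of changes.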
As an example, working does not necessarily have to refer to the same program that is broken; it can also mean a completely different program which nevertheless shares some design or implementation with the broken program. In relation to Blunt, I have been able to test some assumptions about the packet flow by checking Blunt 1 versus Blunt 2, even though both are internally very different. It is important to keep a very open mind when looking for these instances of a working system: having them is the only way to provide a solid foundation for further debugging, and to keep sanity. If everything is broken, it is a pretty rocky road to make any progress. It is not impossible though, and in those cases, I usually start removing parts of the broken program until at least something works, and then work backwards by adding code.

Now, how about Blunt?

All of the above also applies to reverse engineering, which is still the biggest chunk of work to get Blunt running. Reverse engineering is in some sense simpler than debugging since the code is known to work, and understanding it is achieved by disassembly or by black box testing techniques to verify assumptions about how the code works. I noticed that persistence and thoroughness really pay off in this area, and it seems that Blunt is very close to finally working as designed :) Stay tuned for more!

Dear TCommTool

What is it you really want? - I've been experimenting with Blunt 2, setting up an internet connection using NIE and PPP over Bluetooth, and overall it appears to be quite stable, with the only very annoying problem being that the data is not handed back from Blunt to the NewtonScript layer (and thus NIE). The problem appears to be the inner workings of the TCommTool class, which handles this interaction. What is not entirely clear right now is how to indicate that data has arrived for further processing.
This results in apparent packet loss on the NIE level, and unnecessary packet retransmission or aborted connections. It might be most effective to debug the data flow, but the drawback of using Hammer is that it needs USB-to-serial drivers installed under Mac OS Classic. Well, I've gotten very far without Hammer, maybe now it's again Hammer time!

More reverse engineering results

I've been working with the NewtonOS sampler program a bit more, and started to work a bit more on the inter-task communication. The more detailed results are on the NewtonOS Internals pages. So far, one important finding is that NewtonOS is very lightweight when it comes to memory management, and relies on a flat memory model with very few access restrictions. This means that memory must be allocated and deallocated with slightly different strategies than in a more conventional OS. Specifically, data is shared more openly between tasks and must be kept valid as long as any task might be using it - the kernel does not create its own copies of data.

Technical Information on the NewtonOS

My "NewtonOS Sampler" is now up on GitHub, so far it's just a skeleton though. I'll be using the Newton DDK to implement examples of the OS services. Additionally, Walter Smith and Paul Guyot have a collection of very good documents on the OS itself.

Getting to the bottom of it

The latest changes to Blunt 2 fixed the crash when sending a large number of data blocks, which was caused by memory management issues around the TUPort class used for communication between the tasks in Blunt. More specifically, the allocation and deallocation of memory for the messages was causing problems. I still need to find the right approach though, and it is probably best done via a small test program which uses tasks, shared memory, messages and ports.

Stabilizing Blunt 2

I've gotten Blunt 2 quite a bit more stable, and have also added missing functionality to send larger data packets in smaller chunks.
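
Sending a larger packet in smaller chunks can be illustrated with a short Python sketch. This is not the Blunt 2 code, which runs on the Newton; the function name `split_into_chunks` and the chunk size used in the example are hypothetical and chosen only for illustration.

```python
from typing import Iterator


def split_into_chunks(data: bytes, max_chunk_size: int) -> Iterator[bytes]:
    """Yield consecutive slices of data, each at most max_chunk_size bytes long."""
    if max_chunk_size <= 0:
        raise ValueError("max_chunk_size must be positive")
    for offset in range(0, len(data), max_chunk_size):
        # The last slice may be shorter than max_chunk_size.
        yield data[offset:offset + max_chunk_size]


# Hypothetical example: an 8-byte packet sent in chunks of at most 3 bytes.
chunks = list(split_into_chunks(b"abcdefgh", 3))
```

Concatenating the chunks in order gives back the original packet, which is the property the receiving side relies on.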
I'm testing at the moment with a Conceptronic CBT100C and a PICO card, which both seem to be working quite well with 230kbps as the serial speed. GitHub continues to be very useful to manage the work and track issues (I should probably also indicate the ID of any issue I'm fixing in the git commits), and for those interested, the git repository actually contains the current working stack as an installable package in the "Bluetooth Setup.zip" file ;) One remaining issue is long-term stability though, which I think is related to either improper memory management for the messages sent between the server and the other parts of the system, or problems with the interrupt handlers being paged out.

Blunt 2 status

Here's a quick update on Blunt 2. I've moved the code to GitHub and started to track issues there as well. My development environment is TextMate, and I use MPW just for compiling. The simple serial line based logging works reasonably well, and I've not really missed using Hammer as a debugger (Hammer is still essential for reverse engineering by executing code; for simple code analysis, I use my online database of the NewtonOS ROM). Most problems in Blunt 2 are not in the actual Bluetooth stack, but around interfacing with the Bluetooth hardware: PC Cards use a simple 16450 UART, and sending and receiving data is not trivial due to the lack of documentation for the serial chip interface on the Newton. I've switched to simple non-interrupt based sending of data, which means some performance penalty but should get around problems I've seen with sending bulk data. The other problematic interface is between the Bluetooth stack and the CommTool, which in turn manages the interface to the NewtonScript world. CommTools are unfortunately even less documented, so I am still lacking asynchronous sending functionality and a means to accept connections instead of initiating them.
Progress is slow, as expected. On the other hand, it is nice to work with probably the best notebook Apple ever made, my trusty old Pismo :)

Picking up the Gauntlet

After lengthy GTD experiments (more on that later), programming a bit for the iPhone and bringing development tools for the Newton into the 21st century, I'm thinking it's time to revisit Blunt 2. I have a simplified development setup where I need to use Mac OS 9 only for compiling, and can use e.g. TextMate for editing, a simple serial line for debug output and RDCL for package installation. As a first step, I uploaded the code to GitHub, which should allow much better tracking of changes and experiments.

Newton ROM Cross Referencer

I have now moved my tool for cross referencing the Newton ROM to 40Hz.org. Its usage is quite simple: just enter the function name you want to view into the text field (use % as a wild card character) and press "View" - the resulting listing is hyperlinked to allow further digging into the ROM.

Experimenting with iPhone Development

As a programmer, I'm trying to continuously learn new technologies, and playing around with iPhone development was inevitable. One aspect which motivated me actually goes back to the NeXT era, and that is Objective-C. Back in the day, a friend, his small company and I had just started to dive into C++ since it was the logical next step from C, but Objective-C and Brad Cox's idea of Software-ICs already seemed very appealing then. It was unfortunately limited on the commercial side to NeXTStep and later OpenStep, whereas C++ was much easier to deploy. A while ago I therefore started to experiment with iPhone app development to see what the fuss is about. Using Objective-C is very refreshing; it is great to have a compiled but also very dynamic language. It won't beat NewtonScript though ;).
The result of my experiments is now available in the iTunes app store: a simple, free expense tracking app for personal use called Geld, including Dropbox support. Enjoy!