Don’t let a File System let you down
The embedded world has always trailed the mainstream computing world, especially in the x86 market. This can be a disadvantage in terms of CPU performance and available technology, but it can equally be an advantage: problems have often been solved by the time they reach the embedded market. That holds only when a problem is seen as a “real” problem, however. The mainstream world has lived with one problem since its outset, and has therefore come to accept it as a fact of life rather than a problem. That problem is the File System. PC users accept that they have to shut down their system in a controlled manner and that they might lose data if the power fails. To recover they will often run a check disk utility to restore the file structure, but because this is a time-consuming process it often gets bypassed.
The more mainstream technology is used in embedded devices, the more this will become a “real” problem for embedded engineers, because it is not likely to be addressed in the mainstream. Addressing it means moving away from the structure of the current file system, which sacrifices backward compatibility.
Power Cycle Testing
The difficulty is that these failures rarely show up during the development phase of a project, or even the testing phase. Nobody wants to take on extra cost at the beginning of a project if it can be avoided, and with a short-term view the problem can be ignored at this stage. In reality, ignoring it is going to cost more in the long term.
Even if some sort of power failure testing is done, catching a failure is difficult because millions of operations are performed within a very short timescale, so a randomly timed power cut rarely lands on a vulnerable one.
These failures become a major headache when they occur randomly after the product has left the factory. This might mean a visit by an engineer to swap out the unit. Even when the hardware is returned for testing and reloaded with the software, it might seem to function correctly. The cost is not only that of the return but also a loss of customer confidence. Datalight has found that as many as 80 to 90% of so-called hardware failures actually turn out to be failings in the software.
How is data secured?
Reliance is a transactional file system, supplied with a driver for Windows XP so that data can be transferred from the embedded device to a PC. In brief, a transactional file system is a file system with atomic writes: a change is either committed completely to the media, or it fails to commit and, as far as the file system is concerned, never happened.
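The all-or-nothing behaviour described above can be illustrated with a small in-memory model. This is only a sketch to show the idea and makes no claim about how Reliance is implemented; the class `ToyTransactionalStore` and its methods are hypothetical names invented for this example.

```python
class ToyTransactionalStore:
    """Toy in-memory model of all-or-nothing commits (hypothetical, not a real API)."""

    def __init__(self):
        self._committed = {}  # state that would survive a power loss
        self._working = {}    # pending changes that would be lost on a power loss

    def write(self, name, data):
        # Changes go to the working state only; committed data is left untouched.
        self._working[name] = data

    def commit(self):
        # Build the new state off to one side, then switch to it in one step.
        new_state = dict(self._committed)
        new_state.update(self._working)
        self._committed = new_state  # the single switch-over point
        self._working = {}

    def power_fail(self):
        # Simulate a power cut: everything not yet committed disappears.
        self._working = {}

    def read_committed(self, name):
        # What would be visible after the system restarts.
        return self._committed.get(name)


def demo_power_loss():
    store = ToyTransactionalStore()
    store.write("config.txt", b"version 1")
    store.commit()
    store.write("config.txt", b"version 2")  # not yet committed
    store.power_fail()                       # power lost before commit
    return store.read_committed("config.txt")
```

In this model, calling `demo_power_loss()` returns the contents from the last completed commit, because the interrupted update never reached the committed state.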
The need for a transactional file system will become more apparent as the embedded world uses more and more mainstream technology. There is a growing trend to use off-the-shelf modules and to move from 8-bit and 16-bit processors to 32-bit processors, which in turn normally requires an operating system with a file system.
Reliance is a slot-in replacement for the current File System, or can even work alongside it. Its modular nature also allows a move to a different CPU, storage media or even operating system without having to learn another file system API.
Free Testing Tools
Datalight will offer, free of charge, a utility that tests for failures on current File Systems. The company has identified the most likely points of failure for a File System, and the utility homes in on these areas, typically within the block device driver during certain key operations. The three main ones, “Changing an Attribute”, “Renaming a File” and “Deleting a File”, affect the file structure with varying degrees of severity. It should be stressed, however, that these are not the only points of failure.
Some people might be thinking at this point that they will simply avoid these key operations as much as possible. This makes the problem worse: the event happens less often but is not prevented, so a failure is harder to find when it does occur.
The test utility analyses the number of block device accesses made during an operation. It then allows the system to be stopped at any one of these accesses, giving the chance to power cycle the system. Critically, this can be done with a hardware switch rather than a soft reboot. After the restart, the utility checks what damage was done to the File System.
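The count-then-interrupt approach can be sketched in simulation. The code below is not the Datalight utility and does not touch real hardware: `CountingBlockDevice`, `toy_rename`, `exactly_one_name` and `run_power_cycle_test` are hypothetical names, and the "file system" is a deliberately naive directory held in a single block, assumed here purely for illustration.

```python
import copy


class PowerFailure(Exception):
    """Raised by the toy device to simulate power being cut."""


class CountingBlockDevice:
    """Toy block device that counts writes and can lose power at a chosen write."""

    def __init__(self, blocks, fail_at=None):
        self.blocks = copy.deepcopy(blocks)  # block number -> contents
        self.fail_at = fail_at               # 1-based write that never reaches the media
        self.writes = 0

    def write(self, block_no, data):
        self.writes += 1
        if self.fail_at is not None and self.writes == self.fail_at:
            raise PowerFailure(f"power cut during write {self.writes}")
        self.blocks[block_no] = data


def toy_rename(dev, old, new):
    """Naive, non-transactional rename built from two separate block writes."""
    inode = dev.blocks[0][old]
    # Write 1: add the new name to the directory block.
    dev.write(0, {**dev.blocks[0], new: inode})
    # Write 2: remove the old name from the directory block.
    dev.write(0, {k: v for k, v in dev.blocks[0].items() if k != old})


def exactly_one_name(blocks):
    # Consistency rule for this toy: the file must appear under exactly one name.
    return len(blocks[0]) == 1


def run_power_cycle_test(initial_blocks, operation, check):
    # Pass 1: dry run to count how many block writes the operation performs.
    dry_run = CountingBlockDevice(initial_blocks)
    operation(dry_run)
    total = dry_run.writes

    # Pass 2: repeat the operation, cutting power at each write in turn,
    # then inspect what was left on the "media".
    results = {}
    for n in range(1, total + 1):
        dev = CountingBlockDevice(initial_blocks, fail_at=n)
        try:
            operation(dev)
        except PowerFailure:
            pass
        results[n] = check(dev.blocks)
    return total, results
```

Calling `run_power_cycle_test({0: {"a.txt": 7}}, lambda d: toy_rename(d, "a.txt", "b.txt"), exactly_one_name)` reports two writes for the rename. A cut at the first write leaves the directory consistent, while a cut at the second leaves the same file listed under both names. Testing every cut point in turn, rather than cutting power at random, is what makes such a window reproducible.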
If there are failures, a Reliance evaluation version can be downloaded to reproduce exactly the same test scenario and compare the results. What should be found is that under Reliance there are no extended start-up times caused by file system checking, no lost or cross-linked files, and no completely unbootable system, which is the worst case.
A download of the test tool is available at: www.hitex.co.uk



