This article gives an overview of dependability in software systems, focusing on terminology and key concepts. The key points are summarized below.


Dependability:

  • Refers to the user's confidence in a software system to deliver the expected service reliably.
  • A failure occurs when the service actually delivered deviates from the specified service.
  • An error is an incorrect part of the system state that can lead to a failure.
  • Faults are the root cause of errors, arising from programmer mistakes, design flaws, or misunderstanding of requirements.
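
The fault, error, failure chain can be illustrated with a minimal Python sketch. The `average` function and its off-by-one fault are hypothetical, invented here for illustration only. The sketch also shows that a fault does not always produce a failure: it depends on the input that activates it.

```python
def average(values):
    """Return the arithmetic mean of a non-empty list (as specified)."""
    total = 0
    # FAULT: the loop starts at index 1, so values[0] is never added.
    for i in range(1, len(values)):
        total += values[i]
    # ERROR: `total` is wrong whenever values[0] is not zero.
    return total / len(values)


# FAILURE: the delivered result deviates from the specified one.
observed = average([4, 2, 6])  # 8/3, about 2.67; the specification says 4.0

# The same fault is executed here, but skipping a zero leaves the state
# correct, so no error arises and no failure is observed.
masked = average([0, 3, 3])    # 2.0, which matches the specification
```

A test suite that only used inputs like the second one would never reveal the fault, which is one reason test input selection matters.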


Approaches to Dependability:

  • Fault Avoidance: Techniques employed to prevent certain types of faults from being introduced in the first place. Examples include using languages with built-in checks like array bounds checking in Java or employing rigorous coding standards.
  • Fault Tolerance: Designing systems with redundant components or mechanisms so that service continues even if certain components fail. This is essential for safety-critical systems such as aircraft.
  • Fault Removal: Applying verification techniques to identify and remove latent faults from the code. This is where software testing plays a vital role.
  • Fault Forecasting: Estimating the likelihood of failures based on the program's behavior and testing results. This helps assess the overall reliability of the system.
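
As a rough illustration of fault tolerance through redundancy, the sketch below runs three replicas of a computation and takes a majority vote, so a single faulty replica is outvoted. All names (`majority_vote`, `replica_ok`, `replica_faulty`, `tolerant_square`) are hypothetical, and real systems would typically use independently developed replicas running on separate hardware.

```python
from collections import Counter


def majority_vote(readings):
    """Return the value reported by a strict majority of replicas."""
    if not readings:
        raise ValueError("no readings to vote on")
    value, count = Counter(readings).most_common(1)[0]
    if count * 2 <= len(readings):
        # No strict majority: the voter cannot mask the disagreement.
        raise RuntimeError("no majority among replicas")
    return value


def replica_ok(x):
    return x * x


def replica_faulty(x):
    # Hypothetical replica containing a fault.
    return x * x + 1


def tolerant_square(x):
    """Square x using three replicas, masking one faulty result."""
    replicas = [replica_ok, replica_ok, replica_faulty]
    return majority_vote([replica(x) for replica in replicas])
```

With these assumptions, `tolerant_square(3)` returns 9 even though one replica reports 10. If all three replicas disagreed, the voter would raise instead of guessing.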



Dependability Metrics:

  • Reliability: Measures the continuity of correct service, indicating how long a system can operate without failures.
  • Availability: Measures the readiness of the software to respond to user requests, indicating the percentage of time the system is functional.
  • Safety: Focuses on the absence of catastrophic consequences due to software failures.
  • Integrity: The absence of improper alterations to the system and its data, including alterations made through unauthorized access.
  • Maintainability: Ability to modify and repair the software to adapt to changing requirements and ensure long-term functionality.
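
Reliability and availability are often quantified with standard textbook formulas, sketched below. These formulas are not taken from the source: the mean time to failure (MTTF), the mean time to repair (MTTR), and the constant failure rate behind the exponential reliability model are all assumptions introduced here for illustration.

```python
import math


def availability(mttf_hours, mttr_hours):
    """Steady-state fraction of time the system is able to serve requests."""
    return mttf_hours / (mttf_hours + mttr_hours)


def reliability(t_hours, mttf_hours):
    """Probability of running t_hours without failure.

    Assumes a constant failure rate of 1 / mttf_hours (exponential model).
    """
    return math.exp(-t_hours / mttf_hours)


# Hypothetical system: fails on average every 999 hours, repaired in 1 hour.
a = availability(999, 1)    # 0.999, i.e. "three nines"
r = reliability(100, 999)   # chance of surviving a 100-hour mission
```

The two measures answer different questions: a system that fails often but recovers within seconds can be highly available yet unreliable for long uninterrupted missions.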


Critical Systems and Robustness:

  • Critical systems require meticulous planning for failure scenarios, including anticipation of potential software, hardware, and environmental problems.
  • Robustness refers to a system's ability to function correctly even under unexpected or adverse conditions.
  • Testing for critical systems needs to incorporate stress tests and simulated failures to ensure the system can handle such situations effectively.
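
A simulated failure can be as simple as substituting a component that always fails and checking that the system degrades gracefully. The sketch below is a minimal example of that idea; the sensor functions, the `SensorFailure` exception, and the fallback policy are all hypothetical.

```python
class SensorFailure(Exception):
    """Raised when a (hypothetical) sensor cannot deliver a reading."""


def read_altitude(primary, backup):
    """Return a reading from the primary sensor, falling back to the backup."""
    try:
        return primary()
    except SensorFailure:
        return backup()


def failing_sensor():
    # Injected fault: this stand-in always fails.
    raise SensorFailure("simulated sensor failure")


def healthy_sensor():
    return 10000.0


def test_fallback_on_primary_failure():
    # The system should survive the loss of the primary sensor.
    assert read_altitude(failing_sensor, healthy_sensor) == 10000.0


def test_double_failure_is_reported():
    # With both sensors down, the failure must surface rather than be hidden.
    try:
        read_altitude(failing_sensor, failing_sensor)
    except SensorFailure:
        return
    raise AssertionError("double failure was silently ignored")


test_fallback_on_primary_failure()
test_double_failure_is_reported()
```

The second test matters as much as the first: a robust system should report a situation it cannot handle instead of returning a plausible but wrong value.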


DO-178C and Software Criticality Levels in Avionics Systems:

The DO-178C standard, formally titled "Software Considerations in Airborne Systems and Equipment Certification," provides guidelines for developing and certifying software used in avionics systems. A key aspect of this standard is classifying software based on its criticality, which determines the rigor of the development and verification processes required.


DO-178C defines five software levels, A through E, each named after the most severe failure condition the software could cause or contribute to:

  • Level A (Catastrophic): Failure of the software could lead to a crash or severe injury.
  • Level B (Hazardous): Failure could significantly impact safety or performance, making it difficult for the crew to operate the aircraft.
  • Level C (Major): Failure would increase crew workload or cause passenger discomfort.
  • Level D (Minor): Failure would be noticeable but have minimal impact on safety or operation.
  • Level E (No Effect): Failure would have no impact on the aircraft or crew.

The criticality level of a software component determines the following:

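  • Development processes: Higher criticality levels require more stringent development processes, with more objectives to satisfy and more of them checked independently of the author, for example rigorous code reviews and, where a project chooses to use them, formal methods.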
  • Verification and testing: More extensive testing and analysis are required for higher criticality levels.
  • Documentation: More detailed documentation is required for higher criticality levels.
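
One concrete way the level scales verification is structural coverage. The sketch below encodes the commonly cited mapping for DO-178C software levels A (Catastrophic) through E (No Effect): statement coverage from Level C upward, decision coverage from Level B upward, and modified condition/decision coverage (MC/DC) at Level A. It is a simplified summary for illustration, not a substitute for the standard's objective tables, and the names `SoftwareLevel` and `required_coverage` are hypothetical.

```python
from enum import Enum


class SoftwareLevel(Enum):
    """DO-178C software levels and their failure condition categories."""
    A = "Catastrophic"
    B = "Hazardous"
    C = "Major"
    D = "Minor"
    E = "No Effect"


# Simplified summary of structural coverage criteria per level.
_COVERAGE = {
    SoftwareLevel.A: ["statement", "decision", "MC/DC"],
    SoftwareLevel.B: ["statement", "decision"],
    SoftwareLevel.C: ["statement"],
    SoftwareLevel.D: [],
    SoftwareLevel.E: [],
}


def required_coverage(level):
    """Return the structural coverage criteria expected at this level."""
    return list(_COVERAGE[level])


for level in SoftwareLevel:
    criteria = ", ".join(required_coverage(level)) or "none"
    print(f"Level {level.name} ({level.value}): {criteria}")
```

Encoding the mapping as data keeps the rule "higher criticality means a superset of criteria" easy to check in a project's own tooling.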

The goal of scaling rigor with criticality is to give justified confidence that the software is safe and reliable and that it meets the specific requirements of the avionics system.


Key Takeaways:

  • Dependability is a crucial aspect of software development, especially for critical systems.
  • A multi-pronged approach involving fault avoidance, tolerance, removal, and forecasting is essential for building dependable software.
  • Testing plays a vital role in fault removal and in assessing system reliability.
  • Critical systems require robust design and testing to handle potential failures and maintain safety.


References:

Dependability Definitions | Coursera. (n.d.). Coursera. https://www.coursera.org/learn/introduction-software-testing/lecture/mtx5o/dependability-definitions

Laprie, J.C. (1985). Dependable Computing and Fault Tolerance: Concepts and Terminology. Digest of FTCS-15, pp. 2-11.

RTCA. (2012, January). DO-178C: Software Considerations in Airborne Systems and Equipment Certification.