Resilient Networks Essential for Next-Generation Railways

By Maria Stahley October 11, 2023 8 MIN

Railroad safety is increasingly dependent on resilient communications networks.

Local access networks are essential for system operators to relay signals, exchange real-time emergency communications, and otherwise disseminate operational information. Reliable, efficient, and high-capacity broadband is particularly crucial for operating Positive Train Control (PTC), Communications Based Train Control (CBTC), and other network-based rail safety systems.

Keeping these mission-critical systems online will require greater focus on the resilience of broadband networks. Despite the transportation sector’s growing dependence on uninterrupted broadband communications, many rail operators’ fiber-optic networks have changed little since cable was first installed along the country’s rail rights-of-way some 40 years ago.

Graphic: Potential Causes of Communications Failure: System, Physical and Network Failures

Common Causes of Communications Failure

Rail operators’ broadband networks are generally reliable, providing continuous digital and voice communications. Outages most commonly occur when a rail operator relies on a single line of service from a single service provider. In that case, a disruption anywhere along the line can result in a complete system shutdown. Vulnerabilities include environmental or self-inflicted damage, such as from construction activities, extreme weather events, or vehicle accidents. Software glitches and configuration errors also occur, but these are far less common.

Though wayside equipment is designed to perform within harsh environmental conditions, exposure to snow, ice, heat and humidity can damage wired infrastructure over time — especially when hardware systems are not properly tested and maintained.

Environmental incidents frequently involve something as simple as a fallen tree branch or wildlife cutting into aerial cables. For example, a Western New York telecommunications company recently replaced 87 miles of cable due to squirrel damage.

Prolonged power losses can seriously cripple communications networks. Without sufficient back-up power, an outage that affects terminal facilities or central control rooms, for instance, could result in system-wide disruption.

Graphic: Communication System Outages - 178 per year affecting at least 900,000 minutes of user calls or critical services

Source: Federal Communications Commission data (2007-2014)

Network Asset Management Plans

Enterprise-wide asset management plans are useful tools to assess common points of failure.

In collaboration with service providers, rail operators can uncover network risks through asset inventories of their telecommunications infrastructure. Three common approaches are:

  • Logical maps provide a high-level overview of connections between equipment in the field and systems within the operator’s facilities. These show how rail facilities connect to the telecommunications central office, point-of-presence (POP) locations, and logical circuit links between these sites.
  • Physical maps display detailed connection routes for each connection path going back to the central facilities or to POP locations. Maps can show equipment locations and detail any redundant service loops.
  • Software asset management (SAM) optimizes the maintenance, utilization and deployment of software applications. It provides a holistic view of each asset’s lifecycle and the ability to analyze data specific to that asset.

With a clearer understanding of network pathways, operators are better positioned to analyze network diversity and pinpoint areas of vulnerability.

One common point of failure is the reliance on an individual service line. Asset management plans can help to identify whether a second fiber-optic pathway would diversify the delivery of broadband services. Even if rail operators procure redundant services from separate carriers, however, the operator could still be vulnerable to downtime if both carriers use the same local loop connection.
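
The shared local loop problem can be illustrated with a short Python sketch. It assumes a hypothetical asset inventory in which each service is recorded as the ordered list of physical segments it traverses. The segment names and the shared_segments helper are invented for illustration and do not come from any particular asset management tool.

```python
# Hypothetical inventory: each service is the ordered list of physical
# segments it traverses between the control center and the carrier backbone.
def shared_segments(path_a, path_b):
    """Return the segments that appear in both paths, in path_a order."""
    in_b = set(path_b)
    return [segment for segment in path_a if segment in in_b]


carrier_a = ["control-center", "local-loop-17", "carrier-a-pop", "carrier-a-backbone"]
carrier_b = ["control-center", "local-loop-17", "carrier-b-pop", "carrier-b-backbone"]

# The endpoint is shared by definition, so leave it out of the comparison.
endpoints = {"control-center"}
common = [s for s in shared_segments(carrier_a, carrier_b) if s not in endpoints]

print(common)  # ['local-loop-17']
```

An empty result would suggest the two services are physically diverse, at least as far as the inventory records them. Any segment that does appear is a candidate single point of failure.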

Graphic: Construction activities mistakenly cut fiber-optic cables, on average, once every day

Source: The Fiber Optic Association

Three Keys to Resilience

When enhancing communications system resilience, rail operators must look for measures that provide long-term protection. Strengthening cable jackets, for example, may only provide a temporary fix. Effective enhancements involve addressing system redundancy, route diversity, and restorability.

Redundancy

Rail safety systems are designed with back-up communications protocols to ensure operators can overcome congestion, capacity constraints, or signal interference obstacles. Likewise, redundancy measures must be incorporated throughout communications infrastructure designs. Otherwise, if equipment failure occurs, communications will remain impaired until the equipment can be replaced or fixed.

Redundant equipment generally comes in two forms:

  • Hot standby backup systems remain constantly running and synchronized with the primary system. In the event of a failure, hot standby systems transition immediately and seamlessly, without any interruption in service.
  • Cold standby systems stay offline until a failure event occurs, at which point system operators must manually activate them. One example is the use of onboard radio communications when wireless systems go down.
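
The difference between the two forms can be sketched in a few lines of Python. This is a simplified model under stated assumptions, not a description of any vendor's failover logic. The Unit class, the select_active function, and the operator_activated flag are hypothetical names introduced for the example.

```python
from dataclasses import dataclass


@dataclass
class Unit:
    name: str
    healthy: bool = True
    running: bool = True  # a cold standby starts with running=False


def select_active(primary, standby, operator_activated=False):
    """Pick the unit that should carry traffic, or None if service is down."""
    if primary.healthy:
        return primary
    if standby.healthy and standby.running:
        return standby  # hot standby: already running, takes over at once
    if standby.healthy and operator_activated:
        standby.running = True  # cold standby: brought online by hand
        return standby
    return None


failed_primary = Unit("primary", healthy=False)
hot = Unit("hot-standby")
cold = Unit("cold-standby", running=False)

print(select_active(failed_primary, hot).name)  # hot-standby
print(select_active(failed_primary, cold))  # None
print(select_active(failed_primary, cold, operator_activated=True).name)  # cold-standby
```

In this model the cold standby leaves a service gap (the None result) that lasts until an operator steps in, which is the trade-off against the cost of keeping a hot standby running and synchronized.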

To reduce the risk of communications failure, railroad operators can expand infrastructure systems by installing one more unit or component than necessary for normal operations (N+1). N+1 redundancy measures include:

  • Vertical Scalability (Scaling up) adds hardware to existing systems to meet demand, improving response times by spreading load across additional CPU and RAM resources, with no need to synchronize nodes.
  • Horizontal Scalability (Scaling out) adds system nodes, allowing agencies to meet load demands, increase the number of available resources, and reduce risk of downtime.
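
The N+1 arithmetic can be worked through with a brief Python example. The load and capacity figures are hypothetical, chosen only to show why one unit beyond N is enough to ride through a single failure.

```python
import math


def units_required(peak_load, unit_capacity):
    """N: the fewest units that can carry the peak load."""
    return math.ceil(peak_load / unit_capacity)


def survives_single_failure(installed_units, peak_load, unit_capacity):
    """True if the remaining units still carry the peak load after one fails."""
    return (installed_units - 1) * unit_capacity >= peak_load


peak_load = 900  # hypothetical peak demand, e.g. in Mbps
unit_capacity = 400  # hypothetical capacity of one unit, same units

n = units_required(peak_load, unit_capacity)
print(n, survives_single_failure(n, peak_load, unit_capacity))  # 3 False
print(n + 1, survives_single_failure(n + 1, peak_load, unit_capacity))  # 4 True
```

With N+1 units installed, losing any one unit leaves N in service, which by definition is enough for normal operations.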

Graphic: Redundancy - Additional or duplicate communications assets share the load or provide back-up to the primary asset

Photo of circuit board

A single circuit board power issue can cause widespread delay of rail services

Route Diversity

Rail fiber-optic cable networks typically rely upon point-to-point network designs, a system architecture of two directly connected devices. These dedicated networks are fast, efficient and reliable. A break along one route can, however, disrupt the entire network. At a minimum, a more resilient solution would be to install a secondary, parallel route.

Diverse routes provide identical information through unique points of entry and exit, traveling along separate cabling paths. Ideally, the paths have significant separation and — even better — pass through multiple command rooms.

Alternatively, a ring network transmits data in multiple directions. If disruption occurs along any segment of the ring, the transmitted data can travel in the opposite direction. A ring network is generally more complex to design and costly to install. Obtaining real estate to run additional cable routes can often prove difficult.
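
The ring behavior described above can be sketched in Python. This is a simplified model that assumes a single failed link and a hypothetical four-node ring. The ring_path function is an illustration only and is not an implementation of any specific ring protection protocol.

```python
def ring_path(nodes, src, dst, failed_link=None):
    """Return a path from src to dst around the ring, avoiding one failed link.

    nodes lists the ring in order; a link is a frozenset of two adjacent nodes.
    """
    n = len(nodes)
    start = nodes.index(src)
    for step in (1, -1):  # try one direction, then the opposite direction
        path, i = [src], start
        blocked = False
        while nodes[i] != dst:
            j = (i + step) % n
            if failed_link == frozenset((nodes[i], nodes[j])):
                blocked = True
                break
            path.append(nodes[j])
            i = j
        if not blocked:
            return path
    return None


ring = ["A", "B", "C", "D"]
print(ring_path(ring, "A", "C"))  # ['A', 'B', 'C']
print(ring_path(ring, "A", "C", failed_link=frozenset(("B", "C"))))  # ['A', 'D', 'C']
```

A point-to-point link has no second direction to fall back on, which is why the same cut would take it out of service entirely.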

Graphic: Route Diversity - Signal travels between two points, over more than one physical path, with no common points

Restorability

When network devices fail, a technician must travel to wherever the issue can be resolved, significantly delaying service repair and affecting train schedules. Self-healing communications networks can automatically detect and recover from network failures, potentially minimizing downtime. Monitoring software identifies the disruption and reroutes traffic along a redundant pathway. Multiple network nodes share the load and provide various paths for transmitting data and voice communications.
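
The rerouting step can be sketched as a shortest-path search that skips failed links. The Python below is a minimal illustration under stated assumptions: the topology and node names are hypothetical, and breadth-first search stands in for whatever routing logic a real monitoring product uses.

```python
from collections import deque


def find_route(links, src, dst, failed=frozenset()):
    """Breadth-first search for a shortest route that avoids failed links."""
    neighbors = {}
    for a, b in links:
        if frozenset((a, b)) in failed:
            continue  # monitoring has flagged this link as down
        neighbors.setdefault(a, []).append(b)
        neighbors.setdefault(b, []).append(a)

    queue, previous = deque([src]), {src: None}
    while queue:
        node = queue.popleft()
        if node == dst:
            route = []
            while node is not None:  # walk back to the source
                route.append(node)
                node = previous[node]
            return route[::-1]
        for nxt in neighbors.get(node, []):
            if nxt not in previous:
                previous[nxt] = node
                queue.append(nxt)
    return None  # no surviving path: a technician is needed


links = [
    ("wayside", "node-1"), ("node-1", "control"),
    ("wayside", "node-2"), ("node-2", "node-3"), ("node-3", "control"),
]
print(find_route(links, "wayside", "control"))
# ['wayside', 'node-1', 'control']
print(find_route(links, "wayside", "control", failed={frozenset(("node-1", "control"))}))
# ['wayside', 'node-2', 'node-3', 'control']
```

The rerouted path is one hop longer in this example, a small-scale view of the latency cost that redundant pathways can carry.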

Self-healing networks come with greater cost and complexity. The added monitoring and rerouting may also introduce latency, reducing overall network performance.

Graphic: Restorative Measures - Self-healing maintenance enables rapid restoration if services are lost or congested

Coming Soon: Guidance on Rail Communications

The mission-critical importance of maintaining fail-safe communications systems is forcing railway operators to rethink how they can achieve resilience outside of normal operating conditions. Following the fail-safe design philosophy, comprehensive approaches must overcome communications systems vulnerabilities during adverse environmental conditions as well as throughout incidents of system, physical or network failures.

The American Railway Engineering and Maintenance-of-Way Association (AREMA) recently launched Technical Committee 35 – Information and Technology to develop best practices on fail-safe communications networks, among other safety-critical systems. Upcoming guidance will include recommendations for assessing network resilience, prioritizing communications infrastructure improvements, and deploying next-generation applications that can improve the safety, efficiency, reliability and quality of railway operations.

About The Author

Maria Stahley

Railroad & Transit Project Manager