Analysis of the EtherCAT Full-Network Init Problem

 



Executive Summary

Occasionally, on-site equipment suffers a failure during production in which the entire EtherCAT network falls back to Init, and a power cycle or reactivation of the configuration is required to recover. How can you diagnose which slave brought the network down, and what are the possible causes? This article describes:

1.   Why the EtherCAT master returns to Init by itself, and the related parameter settings

2.   How to view the number of times each slave has lost its connection

3.   Possible reasons why a slave loses its connection

Keywords:

EtherCAT master Init, continuous frame loss, network interruption


Version History

This article draws on the existing technical document "Troubleshooting method for EtherCAT network interruption caused by continuous frame loss.docx" and adds content to the body so that users can find as much information as possible in one document. The methods described here were provided under the guidance of ETG technical experts and are collated and released for user engineers to keep and share.

Main Text

1 Why the EtherCAT master returns to Init

With the TwinCAT settings, there is only one situation in which the master enters Init by itself: the frames it sends out fail to return 10 times in a row. With the default TwinCAT master configuration, the master then enters Init and makes no further attempt to return to OP; a restart or reactivation of the configuration is required to get back to OP. If this is the cause, the TwinCAT log usually shows a message stating that frames were lost 10 times.
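The rule above can be illustrated with a small model. This is only a sketch of the behaviour as described in this article, not TwinCAT's actual implementation; the function name `master_state_trace` and the parameter `max_lost` are hypothetical names chosen for illustration.

```python
def master_state_trace(frame_returned, max_lost=10):
    """Return the modelled master state after each cycle.

    frame_returned: iterable of booleans, True if the frame sent in that
    cycle came back to the master. max_lost is a hypothetical parameter
    standing for the "10 consecutive times" limit described in the text.
    """
    state = "OP"
    consecutive_lost = 0
    trace = []
    for returned in frame_returned:
        if state == "OP":
            if returned:
                # Any returned frame resets the consecutive-loss counter.
                consecutive_lost = 0
            else:
                consecutive_lost += 1
                if consecutive_lost >= max_lost:
                    # Default behaviour described above: drop to Init
                    # and do not try to return to OP.
                    state = "INIT"
        trace.append(state)
    return trace


# 9 losses, one good frame, 9 more losses: the master stays in OP.
short_glitches = [False] * 9 + [True] + [False] * 9
print(set(master_state_trace(short_glitches)))

# 10 losses in a row: the master ends in Init and stays there.
print(master_state_trace([False] * 10 + [True])[-2:])
```

The point of the model is that only consecutive losses count: interruptions shorter than the limit never accumulate.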

1.1 ReInit setting of the master


Note: This setting exists for safety reasons. When outgoing frames do not return, the master cannot tell whether the data output to the slaves has actually been accepted, nor can it receive any data sent by the slaves. In this situation it is safer for the master to enter Init of its own accord, wait for human intervention and troubleshooting, and then be restarted. Clear the master's ReInit option only for troubleshooting purposes. With the option cleared, the master stays in OP and continues to send frames even if they do not come back. This carries potential risks, so it is not recommended for normal production. Incidentally, some third-party masters do not report this error; that is not necessarily because they are more stable, but may be because they lack this protection function.

1.2 ReInit setting of the slave

If a momentary interruption occurs at an individual slave so that frames fail to return, and the master continues to send frames afterwards, then whether communication recovers once the slave reconnects depends on the slave's ReInit setting:



This option is also checked by default. If you clear it, the slave will not resume communication even when the master sends data again. This does let you see immediately which slave caused the frame loss. To make the system more robust and able to ride through short interruptions (for example, fewer than 10 lost frames), it is usually recommended to keep the default configuration, which automatically re-establishes OP communication once the point of failure has recovered.

2 How to view which slave has lost its connection

There are two ways to check which slave lost its connection: one is to look at the EtherCAT Online view, and the other is to use a PLC program to read the slave's register.

2.1 Displaying the slave's link-loss count in the EtherCAT Online view

An EtherCAT slave automatically records the number of link losses for each port. In the Online view of the master, select 0310 'Link Lost A/B'. Because servo drives, stepper drives, and EL modules have only ports A and B, the low byte of 0310 holds the link-loss count for Port A (the input) and the high byte holds the link-loss count for Port B (the output).
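The byte split described above can be sketched as follows. The function name `decode_link_lost` is a hypothetical helper for illustration; it assumes the register value has already been read as a 16-bit word.

```python
def decode_link_lost(reg_0310):
    """Split the 16-bit word at register 0310 into per-port counters.

    Returns (port_a_count, port_b_count): the low byte is Port A (input),
    the high byte is Port B (output), as described in the text.
    """
    if not 0 <= reg_0310 <= 0xFFFF:
        raise ValueError("register value must fit in 16 bits")
    port_a = reg_0310 & 0xFF          # low byte
    port_b = (reg_0310 >> 8) & 0xFF   # high byte
    return port_a, port_b


print(decode_link_lost(0x0005))  # (5, 0): five link losses on Port A
print(decode_link_lost(0x0203))  # (3, 2): three on Port A, two on Port B
```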



Then you can see the value of Reg:0310 on the Online page:



In the image above, the current value of Reg:0310 for Term9 (EK1100) at address 1008 is 5, which is the result of plugging and unplugging the network cable 5 times during testing. A few further points to note:

(1) Unplugging the network cable only affects Port A of a slave

If you unplug the network cable from the EK1100, only the low byte of 0310 of the EK1100 itself is incremented, while the EL modules behind it show normal values. If you unplug the network cable from an EK1110 connected to it, the display is the same: the low byte of 0310 of the EK1100 is incremented, while the unplugged EK1110 itself shows normal values.

(2) The slave's register word is reset only when the slave loses power

Tests on the CX5130 + TC3 platform show that powering the EK1100 slave off and on again does not increase the value, because the value is reset every time power is removed. However, when the TC3 master is restarted, or the TC3 development environment is closed and reopened, the Reg:0310 value remains unchanged.

(3) Diagnostic data can be exported as a .csv file

Although screenshots can be analyzed directly, many screenshots are needed if there are many stations. The EtherCAT Online view provides an export function: select Export List from the right-click menu in the image above to export the diagnostic information to a .csv file. Open the .csv file in Excel, select "Text to Columns" under the "Data" menu, and set the delimiter to "semicolon"; the data for each column is then displayed separately:
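As an alternative to Excel, the exported file can be filtered with a short script. The sketch below relies only on the semicolon delimiter mentioned above; the column names `Name` and `Reg:0310` and the sample rows are assumptions for illustration, so check them against the header line of your own export.

```python
import csv
import io


def slaves_with_link_loss(csv_text, name_col="Name", reg_col="Reg:0310"):
    """List (name, port_a_count, port_b_count) for slaves with a non-zero
    link-loss register. Column names are assumed, not taken from TwinCAT."""
    reader = csv.DictReader(io.StringIO(csv_text), delimiter=";")
    hits = []
    for row in reader:
        raw = (row.get(reg_col) or "").strip()
        if not raw:
            continue  # no value exported for this slave
        # Accept either hexadecimal ("0x0005") or decimal ("5") text.
        value = int(raw, 16) if raw.lower().startswith("0x") else int(raw)
        if value:
            hits.append((row[name_col], value & 0xFF, (value >> 8) & 0xFF))
    return hits


# Hypothetical export content, for illustration only.
sample = (
    "Name;Reg:0310\n"
    "Term 9 (EK1100);0x0005\n"
    "Term 10 (EL1008);0x0000\n"
    "Term 11 (EK1110);0x0100\n"
)
for name, port_a, port_b in slaves_with_link_loss(sample):
    print(f"{name}: Port A lost {port_a}x, Port B lost {port_b}x")
```

In practice, read the exported file with `open(path, newline="")` and pass its contents to the function.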


2.2 Reading the register with FB_EcPhysicalReadCmd

If no engineer is on site to check the EtherCAT Online view with a computer, you can write a PLC program that calls the function block FB_EcPhysicalReadCmd to read the slave's register, for example:



Because the register word cannot be mapped to Process Data, it can only be read with the function block. Do not read it every cycle, as this consumes too much CPU and EtherCAT resources; it is recommended to trigger the read when WcState becomes invalid. If the number of slaves exceeds a certain limit, it is best to read them in groups.
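The triggering and grouping logic recommended above can be sketched independently of the PLC code. The following Python model is an illustration only, not a replacement for the function block call: it assumes that a WcState of 0 means valid and any non-zero value means invalid, and it simplifies a real read (which takes several cycles) to one group per cycle. The function name `read_requests` and the parameter `group_size` are hypothetical.

```python
def read_requests(wc_state_samples, slave_addresses, group_size=2):
    """Return a list of (cycle, group) register-read requests.

    A full read of all slaves is scheduled only on the cycle in which
    WcState changes from valid to invalid, and is then worked off one
    group per cycle instead of all at once.
    """
    requests = []
    pending = []          # groups still waiting to be read
    was_invalid = False
    for cycle, wc_state in enumerate(wc_state_samples):
        invalid = wc_state != 0
        if invalid and not was_invalid:
            # Rising edge of "invalid": schedule every slave, in groups.
            pending = [
                slave_addresses[i:i + group_size]
                for i in range(0, len(slave_addresses), group_size)
            ]
        was_invalid = invalid
        if pending:
            requests.append((cycle, pending.pop(0)))
    return requests


# WcState becomes invalid in cycle 2 and stays invalid for three cycles.
print(read_requests([0, 0, 1, 1, 1, 0, 0], [1001, 1002, 1003]))
# [(2, [1001, 1002]), (3, [1003])]
```

Because the read is edge-triggered, a WcState that stays invalid for many cycles causes only one pass over the slaves.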

2.3 Observing the slave's Run light

If a slave has data communication, its Link light and Run light are on; without data communication, the Run light is off. However, this method makes it hard to check the slaves one by one when the electrical cabinets are scattered around the site. Moreover, if the master stops communicating after 10 lost frames, or the problematic slave recovers communication within 10 frames, an operator will hardly notice any change in the indicator lights. If the master and slaves are not set to ReInit automatically, the indicator of the problematic slave will differ from those of the normal slaves.

3 Why the EtherCAT connection is lost

Possible reasons include EMC interference, network cable quality problems, poorly seated network port connections, loose network interfaces, poor slip-ring contact, and damage to the slave module itself. The most common are network cable quality and network port connection problems, which should be investigated first.

3.1 Network cable quality

A prefabricated network cable is usually mass-produced by machine, so its crimping quality is more consistent. With a hand-made cable there are variables such as the length of wire left exposed, the crimping force, and the contact between the shielding layer and the metal shell. You can try opening several connectors and inspecting them.

3.2 Network port connection

You can gently shake the network cable of a station by hand while monitoring the EtherCAT Online view and check whether Lost Frame increases. If it does, vibration during production may be causing the frame loss. Once this node has been identified as the cause of the failure, check the RJ45 connectors and RJ45 network ports for looseness, damage, corrosion, and so on. If conditions permit, the simplest solution is of course to replace the components.