Comment 40 for bug 2036239

Christian Rohmann (christian-rohmann) wrote :

1) Andre, since I switched to active-backup the issue has not come back (so far). But yeah, we are looking for a reproducer as well. A sporadic issue like this is hard to narrow down, and that is likely true for Intel as well.

2) I also just received an email from an Intel developer suggesting a change to the driver to narrow down the issue further. I quote ...

--- cut ---

Could you edit the file drivers/net/ethernet/intel/ice/ice_lag.c (path relative to the kernel source tree base)?
Then find the functions ice_init_lag() and ice_deinit_lag().

Then add one line at the beginning of each function body, right after the variable declarations:

"return 0;" in ice_init_lag() and "return;" in ice_deinit_lag().

The patch would look something like this:

* Memory will be freed in ice_deinit_lag
*/
int ice_init_lag(struct ice_pf *pf)
{
        struct device *dev = ice_pf_to_dev(pf);
        struct ice_lag *lag;
        struct ice_vsi *vsi;
        int err;

+       return 0;
        pf->lag = kzalloc(sizeof(*lag), GFP_KERNEL);
        if (!pf->lag)
                return -ENOMEM;
        lag = pf->lag;

………

* This function is meant to only be called on driver remove/shutdown
*/
void ice_deinit_lag(struct ice_pf *pf)
{
        struct ice_lag *lag;

+       return;
        lag = pf->lag;

Then rebuild the driver and try to reproduce the problem.

--- cut ---

So in essence I believe this just skips the driver's LAG setup and teardown, i.e. the bonding / LACP handling is no longer offloaded to the HW.
I will set this up on one or two of our machines to test. Would you please also try this on your systems?
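
For anyone wondering what the two early returns do in practice, here is a minimal user-space sketch. It is not the ice driver code: struct mock_pf, struct mock_lag, mock_init_lag(), mock_deinit_lag() and mock_demo() are hypothetical stand-ins that I made up to mirror the shape of the quoted functions. It only shows that with the early returns in place the lag pointer is never allocated and the teardown does nothing. Whether the rest of the real driver is happy with pf->lag staying NULL is an assumption this sketch does not check.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-ins, NOT the real ice driver structures. */
struct mock_lag {
	int bonded;
};

struct mock_pf {
	struct mock_lag *lag;
};

/* Same shape as the quoted ice_init_lag() with the suggested change. */
int mock_init_lag(struct mock_pf *pf)
{
	return 0; /* suggested debug change: report success, set nothing up */

	/* Everything below is unreachable while the early return is there. */
	pf->lag = calloc(1, sizeof(*pf->lag));
	if (!pf->lag)
		return -1; /* the real function returns -ENOMEM here */
	return 0;
}

/* Same shape as the quoted ice_deinit_lag() with the suggested change. */
void mock_deinit_lag(struct mock_pf *pf)
{
	return; /* suggested debug change: nothing was set up, nothing to free */

	/* Unreachable while the early return is there. */
	free(pf->lag);
	pf->lag = NULL;
}

/* Walks through init and deinit and checks that the lag state is never created. */
int mock_demo(void)
{
	struct mock_pf pf = { .lag = NULL };

	assert(mock_init_lag(&pf) == 0); /* init still reports success */
	assert(pf.lag == NULL);          /* but no lag state was allocated */

	mock_deinit_lag(&pf);
	assert(pf.lag == NULL);          /* teardown is a no-op as well */

	return 0;
}
```

Calling mock_demo() from a small main() runs the checks. In the sketch the caller still sees a successful init, so from the outside the only difference is that no LAG state exists afterwards.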