1) Andre, after I switched to active-backup the issue is gone (so far). But yeah, we are looking for a reproducer as well. It's hard to narrow down an intermittent issue like this, and that is likely just as true for Intel.
2) But I just received an email from an Intel developer with a suggested change to the driver to narrow down the issue further. I quote ...
--- cut ---
Could you edit file (from kernel source tree base) drivers/net/ethernet/intel/ice/ice_lag.c.
Then find the functions ice_init_lag() and ice_deinit_lag().
Then add this line to the beginning of the functions
return 0; and return; respectively.
the patch nomenclature would look something like this:
  * Memory will be freed in ice_deinit_lag
  */
 int ice_init_lag(struct ice_pf *pf)
 {
     struct device *dev = ice_pf_to_dev(pf);
     struct ice_lag *lag;
     struct ice_vsi *vsi;
     int err;
+    return 0;
     pf->lag = kzalloc(sizeof(*lag), GFP_KERNEL);
     if (!pf->lag)
         return -ENOMEM;
     lag = pf->lag;
………
  * This function is meant to only be called on driver remove/shutdown
  */
 void ice_deinit_lag(struct ice_pf *pf)
 {
     struct ice_lag *lag;
+    return;
     lag = pf->lag;
Then re-build the driver and try to reproduce the problem?
--- cut ---
So in essence, I believe this just skips setting up the driver's LAG support, i.e. offloading the bonding / LACP to the HW.
I will set this up on one or two of our machines to test. Would you please also try this on your systems?
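To illustrate what I think the two early returns do, here is a toy userspace sketch. It is not the ice driver: `mock_pf`, `mock_lag`, `skip_lag` and all `mock_*` functions are hypothetical stand-ins I made up, and `calloc` stands in for `kzalloc`. It also assumes the rest of the driver treats a NULL `pf->lag` as "no LAG support", which I have not verified.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-ins, NOT the real ice driver structures. */
struct mock_lag { int bonded; };
struct mock_pf  { struct mock_lag *lag; };

/* true mimics the suggested "+ return 0;" / "+ return;" lines. */
static bool skip_lag = true;

static int mock_init_lag(struct mock_pf *pf)
{
    if (skip_lag)
        return 0;               /* early return: pf->lag stays NULL */

    pf->lag = calloc(1, sizeof(*pf->lag));
    if (!pf->lag)
        return -ENOMEM;
    return 0;
}

static void mock_deinit_lag(struct mock_pf *pf)
{
    if (skip_lag)
        return;                 /* early return: nothing to tear down */

    free(pf->lag);
    pf->lag = NULL;
}

/* Assumption: other code checks for NULL before using LAG state. */
static bool mock_lag_active(const struct mock_pf *pf)
{
    return pf->lag != NULL;
}

/* One init/deinit cycle; reports whether LAG state existed in between. */
static bool mock_cycle(bool skip)
{
    struct mock_pf pf = { .lag = NULL };
    bool active;

    skip_lag = skip;
    if (mock_init_lag(&pf) != 0)
        return false;
    active = mock_lag_active(&pf);
    mock_deinit_lag(&pf);
    assert(pf.lag == NULL);     /* holds in both modes */
    return active;
}
```

With `mock_cycle(true)` no LAG state is ever allocated, which is the behaviour I expect from the patched driver; `mock_cycle(false)` corresponds to the unpatched flow.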