octavia/ovn: missed healthmon port cleanup

Bug #2062965 reported by Kurt Garloff
This bug affects 1 person
Affects    Status          Importance    Assigned to      Milestone
neutron    Fix Released    Undecided     Fernando Royo

Bug Description

When creating an octavia load-balancer with the ovn provider and adding a health-monitor and then members, octavia creates a neutron hm port in each subnet where a member was added.
When the members are removed again, the hm ports do not get cleaned up. Removing the hm then cleans up only one of the hm ports, the one in the subnet where the vip happens to be. The others remain and are never cleaned up by octavia. This causes problems later, because the subnets cannot be deleted while they are still populated by the orphaned ports.
The cleanup logic simply does not match the hm port creation logic.

Mitigating factors:
* openstack loadbalancer delete --cascade does clean up all hm ports.
* Deleting the health mon before removing the members also avoids the issue.
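
To check whether a deployment already carries such leftovers, one could compare the hm ports against the subnets that still hold members. The following is only a minimal sketch under stated assumptions: the dict keys (`id`, `device_owner`, `subnet_id`) and the helper name `find_orphaned_hm_ports` are hypothetical, and the `device_owner` value is assumed and should be verified against the installed ovn-octavia-provider release. It works on plain data and does not call any OpenStack API.

```python
# Assumed device_owner of OVN health-monitor ports; verify for your release.
HM_DEVICE_OWNER = "ovn-lb-hm:distributed"


def find_orphaned_hm_ports(ports, member_subnets):
    """Return the IDs of hm ports that sit in a subnet without any member.

    ports: iterable of dicts with the hypothetical keys
           'id', 'device_owner' and 'subnet_id'.
    member_subnets: set of subnet IDs that still contain a pool member.
    """
    return [
        port["id"]
        for port in ports
        if port["device_owner"] == HM_DEVICE_OWNER
        and port["subnet_id"] not in member_subnets
    ]
```

The result is a list of candidates only; whether a given port may really be removed still has to be checked by hand, for example when a health monitor is still attached to the pool.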

Kurt Garloff (kgarloff) wrote :

Test script to reproduce the issue.
It can be run against any OpenStack environment with octavia ovn provider loadbalancers, without any special privileges.
The issue was originally observed on OpenStack 2023.2 (Bobcat) as configured by OSISM, which uses kolla-ansible.
Original report is here:
https://github.com/osism/issues/issues/921

description: updated

Kurt Garloff (kgarloff) wrote :

Here's a patch against the ovn-octavia helper that addresses this.
What it does:
* Whenever a port is deleted and a health-mon is active, it checks whether this was the last port in this subnet.
* If so, the neutron hm port is deleted.

This fixes the test case.

Gregory Thiemonge (gthiemonge) wrote :

moving to neutron, the ovn-octavia-provider is a neutron project

affects: octavia → neutron
tags: added: ovn-octavia-provider
Fernando Royo (froyoredhat) wrote :

The condition checked in member_delete [1], pool == OFFLINE, is what breaks the logic: members from a different subnet can keep the pool != OFFLINE, so when no members from subnetX remain, the hm_port for subnetX is left over.

The patch provided in comment #2 is wrong; it appears to fix the test case only because it calls _clean_up_hm_port every time a member is deleted, and _clean_up_hm_port itself decides whether or not to delete the port according to the remaining members.

I will provide a patch (mostly similar to your proposal) in a few moments to fix this corner case.

Kurt, for next time, send your patch upstream and we'll work together on that patch, it would be a pleasure to have more people contributing to the project ;)

[1] https://opendev.org/openstack/ovn-octavia-provider/src/branch/master/ovn_octavia_provider/helper.py#L2200

Changed in neutron:
assignee: nobody → Fernando Royo (froyoredhat)
OpenStack Infra (hudson-openstack) wrote : Fix proposed to ovn-octavia-provider (master)
Changed in neutron:
status: New → In Progress
Kurt Garloff (kgarloff) wrote :

Thanks, Fernando, I will carefully read your patch to spot the difference.
I would really appreciate seeing the fix in Caracal and Bobcat as well, to make octavia/ovn useful there.
When you say "send your patch upstream next time": Do you prefer mails to openstack-dev rather than bug reports with reproduction scripts and patches?

Fernando Royo (froyoredhat) wrote :

You're welcome, Kurt. Once the fix is merged into master, we can backport it to the stable branches. Regarding my comment, I meant that you can propose your fixes/improvements to the code just as I sent the patch upstream yesterday, basically as a new patch to the repository via https://review.opendev.org/.

And of course, keep reporting bugs here on Launchpad; when they come with a reproducer script, that is also very helpful.

OpenStack Infra (hudson-openstack) wrote : Fix merged to ovn-octavia-provider (master)

Reviewed: https://review.opendev.org/c/openstack/ovn-octavia-provider/+/916637
Committed: https://opendev.org/openstack/ovn-octavia-provider/commit/f034bab144b68cf96c538339e389c4cc7c6d7d63
Submitter: "Zuul (22348)"
Branch: master

commit f034bab144b68cf96c538339e389c4cc7c6d7d63
Author: Fernando Royo <email address hidden>
Date: Mon Apr 22 15:47:46 2024 +0200

    Remove leftover OVN LB HM port upon deletion of a member

    When a load balancer pool has a Health Monitor associated with it,
    an OVN LB Health Monitor port is created for each backend member
    subnet added.

    When removing backend members, the OVN LB Health Monitor port is
    cleaned up only if no more members are associated with the Health
    Monitor pool. However, this assumption is incorrect. This patch
    corrects this behavior by checking instead if there are more members
    from the same subnet associated with the pool. It ensures that the
    OVN LB Health Monitor port is deleted only when the last member from
    the subnet is deleted. If the port is being used by another different
    LB Health Monitor, `_clean_up_hm_port` will handle it.

    Closes-Bug: #2062965
    Change-Id: I4c35cc5c6af14bb208f4313bb86e3519df0a30fa
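
The difference between the two conditions described in the commit message can be illustrated with a toy model. This is a sketch only, not the provider's code: the `Member` class and the functions `should_cleanup_old`, `should_cleanup_fixed` and `delete_member` are hypothetical names, the hm ports are reduced to a set of subnet IDs, and the additional check that `_clean_up_hm_port` performs for ports shared with another health monitor is left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Member:
    address: str
    subnet_id: str


def should_cleanup_old(remaining):
    # Pre-fix condition as described: clean up only once the pool is empty.
    return len(remaining) == 0


def should_cleanup_fixed(remaining, deleted):
    # Post-fix condition: clean up when no remaining member shares
    # the subnet of the member that was just deleted.
    return all(m.subnet_id != deleted.subnet_id for m in remaining)


def delete_member(members, deleted, hm_ports, fixed):
    """Return (remaining members, remaining hm port subnets)."""
    remaining = [m for m in members if m != deleted]
    if fixed:
        cleanup = should_cleanup_fixed(remaining, deleted)
    else:
        cleanup = should_cleanup_old(remaining)
    ports = set(hm_ports)
    if cleanup:
        # Only the hm port in the deleted member's subnet is considered.
        ports.discard(deleted.subnet_id)
    return remaining, ports
```

With one member in each of two subnets, the old condition never fires for the first deletion, so that subnet's hm port stays behind after the pool has been emptied; the fixed condition removes each port together with the last member of its subnet.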

Changed in neutron:
status: In Progress → Fix Released
OpenStack Infra (hudson-openstack) wrote : Fix proposed to ovn-octavia-provider (stable/2024.1)

Fix proposed to branch: stable/2024.1
Review: https://review.opendev.org/c/openstack/ovn-octavia-provider/+/917724

OpenStack Infra (hudson-openstack) wrote : Fix proposed to ovn-octavia-provider (stable/2023.2)

Fix proposed to branch: stable/2023.2
Review: https://review.opendev.org/c/openstack/ovn-octavia-provider/+/917725

OpenStack Infra (hudson-openstack) wrote : Fix proposed to ovn-octavia-provider (stable/2023.1)

Fix proposed to branch: stable/2023.1
Review: https://review.opendev.org/c/openstack/ovn-octavia-provider/+/917726
