libceph: don't allow bidirectional swap of pg-upmap-items
author Ilya Dryomov <idryomov@gmail.com>
Mon, 18 Sep 2017 10:21:37 +0000 (12:21 +0200)
committer Ilya Dryomov <idryomov@gmail.com>
Tue, 19 Sep 2017 18:34:29 +0000 (20:34 +0200)
commit 29a0cfbf91ba997591535a4f7246835ce8328141
tree 3ead87aa6f406f5c6ebbb8205fdb5cbb87327775
parent 2bd6bf03f4c1c59381d62c61d03f6cc3fe71f66e

This reverts most of commit f53b7665c8ce ("libceph: upmap semantic
changes").

We need to prevent duplicates in the final result.  For example, we
can currently take

  [1,2,3] and apply [(1,2)] and get [2,2,3]

or

  [1,2,3] and apply [(3,2)] and get [1,2,2]

The rest of the system is not prepared to handle duplicates in the
result set like this.
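
To make the two cases above concrete, here is a minimal sketch of a
naive per-slot substitution.  It is an illustration only, not the code
in net/ceph/osdmap.c; struct upmap_item and naive_apply() are
hypothetical names.

```c
#include <stddef.h>

/* Hypothetical (from, to) pair, standing in for one pg-upmap-items entry. */
struct upmap_item {
	int from;
	int to;
};

/*
 * Naive rule (assumed for illustration): rewrite every slot on its own,
 * using the first pair whose 'from' matches.  Nothing checks whether
 * 'to' is already present, so the result can contain duplicates.
 */
void naive_apply(int *osds, size_t n,
		 const struct upmap_item *items, size_t n_items)
{
	for (size_t i = 0; i < n; i++) {
		for (size_t k = 0; k < n_items; k++) {
			if (osds[i] == items[k].from) {
				osds[i] = items[k].to;
				break;	/* at most one rewrite per slot */
			}
		}
	}
}
```

Under this rule, [1,2,3] with [(1,2)] becomes [2,2,3] and [1,2,3] with
[(3,2)] becomes [1,2,2], which are exactly the duplicates described
above.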

The reverted piece was intended to allow

  [1,2,3] and [(1,2),(2,1)] to get [2,1,3]

to reorder primaries.  First, this bidirectional swap is hard to
implement in a way that also prevents dups.  For example, [1,2,3] and
[(1,4),(2,3),(3,4)] would give [4,3,4], and if we just dropped the last
step we'd have [4,3,3], which is also invalid, etc.  It is simpler to
just not handle bidirectional swaps.  Second, in practice they are not
needed: if you just want to choose a different primary then use
primary_affinity, or pg_upmap (not pg_upmap_items).
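
One way to get a duplicate-free result is to skip any pair whose target
already appears in the set.  The sketch below illustrates that rule
under simplifying assumptions; it is not the patch itself.
struct upmap_item and guarded_apply() are hypothetical names, and any
other validity checks the real code may perform are left out.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical (from, to) pair, standing in for one pg-upmap-items entry. */
struct upmap_item {
	int from;
	int to;
};

/*
 * Apply the pairs one at a time.  A pair is ignored when 'to' already
 * appears in the set, so no duplicate can be introduced.  The price is
 * that a bidirectional swap such as [(1,2),(2,1)] becomes a no-op.
 */
void guarded_apply(int *osds, size_t n,
		   const struct upmap_item *items, size_t n_items)
{
	for (size_t k = 0; k < n_items; k++) {
		bool exists = false;
		ptrdiff_t pos = -1;

		for (size_t j = 0; j < n; j++) {
			if (osds[j] == items[k].to) {
				exists = true;	/* replacement already present */
				break;
			}
			if (osds[j] == items[k].from && pos < 0)
				pos = (ptrdiff_t)j;	/* first slot holding 'from' */
		}
		if (!exists && pos >= 0)
			osds[pos] = items[k].to;
	}
}
```

With this rule, [1,2,3] with [(1,2)] or with [(1,2),(2,1)] stays
[1,2,3], [1,2,3] with [(1,4)] becomes [4,2,3], and [1,2,3] with
[(1,4),(2,3),(3,4)] becomes [4,2,3].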

Cc: stable@vger.kernel.org # 4.13
Link: http://tracker.ceph.com/issues/21410
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Reviewed-by: Sage Weil <sage@redhat.com>
net/ceph/osdmap.c