Segmenting Object Affordances:
Reproducibility and Sensitivity to Scale

T. Apicella¹,², A. Xompero², P. Gastaldo¹, A. Cavallaro³,⁴

¹University of Genoa, Italy; ²Queen Mary University of London, United Kingdom;
³Idiap Research Institute, Switzerland; ⁴Ecole Polytechnique Federale de Lausanne, Switzerland

Visual affordance segmentation identifies the regions of an object in an image that an agent can interact with. Existing methods re-use and adapt learning-based architectures for semantic segmentation to the affordance segmentation task and evaluate them on small datasets. However, experimental setups are often not reproducible, which leads to unfair and inconsistent comparisons. In this work, we benchmark these methods under a reproducible setup on two single-object scenarios, tabletop without occlusions and hand-held containers, to facilitate future comparisons. We include a version of a recent architecture, Mask2Former, re-trained for affordance segmentation and show that this model is the best-performing on most testing sets of both scenarios. Our analysis shows that models are not robust to scale variations when object resolutions differ from those in the training set.

Results unoccluded setting

Jaccard index on UMD testing set
Model name	graspable	cut	scoop	contain	pound	support	wrap-grasp	average
CNN	56.83	71.71	32.48	82.35	50.59	36.59	71.50	57.44
AffordanceNet	52.43	72.14	32.16	84.59	71.20	47.24	83.58	63.33
DRNAtt	59.64	74.08	40.33	84.69	71.64	71.71	84.44	69.50
M2F-AFF	68.73	87.43	56.52	85.57	66.72	79.42	86.66	75.86
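
The tables on this page report the Jaccard index (intersection over union) per affordance class, in percent. As a reading aid, the following is a minimal sketch of how such a per-class score can be computed from a predicted and an annotated label map. The function name, the class ids, and the toy label maps are hypothetical, and the sketch scores a single image pair; how the benchmark aggregates scores over a testing set is not stated here, so this should not be read as the authors' evaluation code.

```python
def jaccard_per_class(pred, target, class_ids):
    """Return {class_id: Jaccard index in percent} for two equally sized 2D label maps."""
    scores = {}
    for c in class_ids:
        inter = 0
        union = 0
        for pred_row, target_row in zip(pred, target):
            for p, t in zip(pred_row, target_row):
                in_pred = p == c
                in_target = t == c
                inter += in_pred and in_target  # pixel labelled c in both maps
                union += in_pred or in_target   # pixel labelled c in at least one map
        # Assumption: a class absent from both maps is skipped rather than scored.
        if union > 0:
            scores[c] = 100.0 * inter / union
    return scores


# Toy 3x4 label maps; 0 = background, 1 = graspable, 2 = contain (hypothetical ids).
target = [
    [0, 1, 1, 0],
    [0, 2, 2, 0],
    [0, 2, 2, 0],
]
pred = [
    [0, 1, 0, 0],
    [0, 2, 2, 2],
    [0, 2, 2, 0],
]

scores = jaccard_per_class(pred, target, class_ids=[1, 2])
# Unweighted mean over the scored classes (background excluded in this sketch).
average = sum(scores.values()) / len(scores)
print(scores, average)
```

In the toy example, the prediction misses one of the two graspable pixels and adds one spurious contain pixel, so the two classes score 50 and 80 and their unweighted mean is 65.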

Results hand-occluded setting

Jaccard index on CHOC-AFF, CCM-AFF, HO-3D-AFF testing sets
Testing set	Model name	graspable	contain	arm	average
CHOC-B
	RN50F	93.27	83.27	-	-
	RN18U	93.45	79.95	93.24	88.88
	DRNAtt	93.63	83.88	94.30	90.60
	ACANet	93.88	85.17	93.24	90.76
	ACANet50	94.00	85.57	93.70	91.09
	M2F-AFF	95.48	88.61	95.36	93.15

CHOC-I
	RN50F	92.20	68.73	-	-
	RN18U	92.94	68.04	93.78	84.92
	DRNAtt	92.85	66.13	94.07	84.35
	ACANet	93.11	69.86	93.90	85.62
	ACANet50	93.37	72.66	94.07	86.70
	M2F-AFF	95.26	77.62	96.04	89.64

HO-3D-AFF
	RN50F	18.14	73.56	-	-
	RN18U	64.79	78.42	32.73	58.64
	DRNAtt	38.54	18.25	0.32	19.04
	ACANet	73.93	73.07	40.00	62.33
	ACANet50	58.40	64.43	39.36	54.06
	M2F-AFF	37.35	65.24	34.10	45.56

CCM-AFF
	RN50F	6.09	10.61	-	-
	RN18U	13.20	22.28	27.68	21.05
	DRNAtt	6.35	0.00	0.23	2.19
	ACANet	10.06	25.83	31.00	22.30
	ACANet50	8.12	17.43	32.54	19.36
	M2F-AFF	30.49	44.27	53.32	42.69

Available models

Model name	UMD	CHOC-AFF
CNN	.zip
AffordanceNet	.zip
ACANet		.zip
ACANet50		.zip
RN50-F		.zip
ResNet18-UNet		.zip
DRNAtt	.zip	.zip
Mask2Former	.zip	.zip

Reference

If you use the code, or the models, please cite the following reference.

Plain text format

        T. Apicella, A. Xompero, P. Gastaldo, A. Cavallaro, Segmenting Object Affordances: Reproducibility and Sensitivity to Scale, 
        Proceedings of the European Conference on Computer Vision Workshops, Twelfth International Workshop on Assistive Computer Vision and Robotics (ACVR),
        Milan, Italy, 29 September 2024.

Bibtex format

 
        @InProceedings{Apicella2024ACVR_ECCVW,
            title = {Segmenting Object Affordances: Reproducibility and Sensitivity to Scale},
            author = {Apicella, T. and Xompero, A. and Gastaldo, P. and Cavallaro, A.},
            booktitle = {Proceedings of the European Conference on Computer Vision Workshops},
            note = {Twelfth International Workshop on Assistive Computer Vision and Robotics},
            address={Milan, Italy},
            month="29" # SEP,
            year = {2024},
        }
