Automatic system to identify and manage garments for blind people


INTRODUCTION
In recent years, there has been a concerted effort to develop technologies that provide people with disabilities access to information and knowledge [1]. For individuals who are blind, several technological solutions have been proposed to assist with daily routine activities [2], [3]. Despite the advancements in assistive technology, blind individuals still face difficulties in performing basic daily tasks, such as selecting their clothing. The process of identifying features on garments is often slow and challenging, leading to a loss of autonomy in choosing their desired clothing. This study aims to address these challenges by proposing an automatic wardrobe that improves the quality of life and well-being of blind people, complementing the continuous work developed under the same project [4]-[11], [14]. The major contribution of this research is the development of a mechatronic system prototype that facilitates garment selection and management. The prototype is divided into two distinct modules: the first is dedicated to the physical prototype, while the second is dedicated to image processing and machine learning algorithms that segment, classify, and extract colours from clothing. Furthermore, this work had the support of the Association of the Blind and Amblyopes of Portugal (ACAPO), which allowed a preliminary validation.
This paper comprises seven sections. Section 2 outlines the prior research. Section 3 presents a general overview of the system, while Sections 4 and 5 discuss the hardware and software architectures, respectively. Section 6 provides an initial analysis of the results obtained, and Section 7 presents concluding remarks as well as suggestions for future research.

ABSTRACT
In recent years, there has been increased attention towards the integration of people with disabilities in society, with significant efforts being made to promote their inclusion. Technology has played a critical role in this effort, with several technological solutions emerging to help people with disabilities in their daily routines, enabling them to better integrate into society. However, challenges remain, particularly in basic tasks for blind people, such as managing and identifying personal garments. This study seeks to improve the quality of life and well-being of blind people. A mechatronic automatism that identifies the user's garments using sensors was developed. An interface was integrated through a server responsible for managing the user's requests. Algorithms were implemented for the segmentation and classification of garments and for detecting the predominant colours of each garment. The results obtained indicate that the system could be an efficient solution to reduce the time taken for garment selection, particularly in terms of colour differentiation and the selection of combinations for blind people.

PREVIOUS WORK
Existing solutions for helping blind people choose their garments rely primarily on the concept of smart wardrobes, which has seen a significant surge in popularity in recent years [12]. Perry et al. studied the level of acceptance of smart wardrobes and consumers' opinions about this approach [13]. Their study revealed that 84% of the participants identified ease of use and utility as the predominant factors for accepting this technological model.
A survey carried out with ACAPO (Portuguese Association of Blind and Amblyope People) identified several problems that blind people face when identifying garments [14]. The survey also showed that there was no solution to help blind people manage their garments. MyEyes, proposed by Rocha et al. in [4], [5], was developed to fill this gap. The solution integrates a mobile application with an Arduino board and gives users a virtual wardrobe containing their personal garments. Near Field Communication (NFC) technology is used to add garments. As a future development of this work, a possible physical implementation of a wardrobe to help blind people was presented in [10]. Additionally, R. Alabduljabbar et al. presented a system that integrates NFC technology with a smartphone, allowing visually impaired people to choose the desired clothes [15]. A similar solution was proposed in [16] by S. J. V. Gatis Filho et al.; it combines NFC technology with Quick Response (QR) technology, with the main goal of developing a clothing matching system with audio description.
Solutions aimed at people without any disability are on the rise, with more implementations emerging. Goh et al. [17] proposed a system that integrates tags with Radio Frequency Identification (RFID) technology, allowing unique identification of clothing items. This system is controlled by an application that allows garment management and suggests clothing items based on several criteria, such as style, colour, material, and the user's mood. On the other hand, some fashion brands present different solutions developed to help people plan what to wear. In 2017, Amazon presented the Echo Look [18], a kit with a camera that captures garment photos and catalogues outfits. It also suggests combinations based on the weather and the user's preferences. The mobile application Fashion API [19] is a "closet" that plans what to purchase and adds garments based on QR code reading. Another mobile application is Smart Closet [20], which plans combinations to wear and allows the addition of clothing items based on photo capture. TailorTags [21] is a system that uses smart tags to detect garments automatically and suggests combinations based on the user's preferences. The systems presented focus on clothing management and are not targeted only at blind people; they also include examples of what has been made available to the general public to assist and facilitate the selection of clothing pieces. Table 1 summarizes the characteristics of the systems analysed.
Despite all efforts to develop systems to help blind people, the systems in Table 1, except for the solutions presented in [4], [5], [15], and [16], focus on people without any disability. As the project encompasses the recognition and identification of garment colours, it was necessary to identify solutions that help blind people detect colours. These implementations range from mobile applications to small electronic devices. Among mobile applications, V7 Aipoly [22] allows the identification of not only colours but also objects and text. The system helps blind people with daily routine tasks and interacts with the user by audio description.
Regarding physical devices, there are the ColorTest2000 and the Colorino. The ColorTest2000 [23] allows the identification of over 1700 different colours of daily objects, and the Colorino [24] is capable of distinguishing 150 colour tones. Yang, Yuan, and Tian presented in [25] a prototype that analyses garments and allows the identification of 4 different types of patterns and 11 different colours. The system integrates a camera, a microphone, a computer, and headphones. The Hue, Saturation, Intensity (HSI) model is used for colour identification, and machine learning algorithms are used to recognize the patterns. In [26], J. Jarin et al. proposed a similar system capable of detecting colours and patterns in clothing pieces. The garment photos captured by the camera are submitted to a Support Vector Machine (SVM) algorithm to identify and classify the different colours and patterns on each piece. The proposal presented in [27] by X. Yang et al. is similar to the previous solution; however, it uses Statistical Feature (STA) and Scale Invariant Feature Transform (SIFT) to identify the colours and patterns in each garment. Lastly, Medeiros, J. presented in [28] a small wearable device with an endoscopic camera, worn on the tip of a finger, that captures images of the garment surface. To identify and classify different textures, a classification approach is used that combines two complementary features, using an ImageNet classifier and an SVM. The colour detection is based on Superpixel Segmentation, which is capable of detecting multiple colours simultaneously.
Regarding garment segmentation, several solutions have been presented. Another capability of the system presented in this paper is the classification and segmentation of garments. The algorithm used in the system implementation was the Mask Region-based Convolutional Neural Network (Mask R-CNN), which mainly addresses image segmentation. It was therefore necessary to identify implementations of this algorithm in the garment domain. The solution proposed by Khurana, T. et al. in [29] allows the distinction of garments with similar colours and patterns. First, the CNN is used to detect and segment the spatial limits of each garment and, next, the features are identified. The solution in [33] is capable of detecting t-shirt and long-sleeve shirt silhouettes using instance segmentation with Mask R-CNN. The dataset encompasses 9000 annotated images. The results showed that the model performed well in the detection of t-shirts, with an average precision of 95 %. Furthermore, several studies [34]-[37] presented methods to segment objects using CNNs that have similarities to the implementation used and modelled for garment segmentation and classification. Table 2 summarizes the main results of the segmentation and classification methods analysed.

Table 1. Analysed solutions overview.

Solution | Description
MyEyes [4], [5] | Manage garments with RFID
"An IoT smart clothing system for the visually impaired using NFC technology" [15] | Manage garments based on NFC technology
"My best shirt with the right pants: improving the outfits of visually impaired people with QR codes and NFC tags" [16] | Manage garments based on the combination of NFC and QR technology
"Developing a smart wardrobe system" [17] | Adds clothing items to the wardrobe based on RFID tags reading
Echo Look [18] | Suggests advice based on weather and personal trend
Fashion API [19] | Adds clothing items to the virtual wardrobe based on QR code reading
Smart Closet [20] | Adds clothing items to the virtual wardrobe based on photo capture
TailorTags [21] | Adds clothing items to the virtual wardrobe based on wireless tags detection

The proposed system's primary objective is to identify and classify clothing items so that blind individuals can select their clothing with ease. To achieve this goal, a thorough analysis of the current state-of-the-art solutions was conducted, revealing a significant gap in solutions that assist visually impaired individuals in this task. Thus, a new system is proposed that combines a physical prototype with artificial intelligence (AI) algorithms to classify clothing items and detect their predominant colours, enabling blind individuals to make informed clothing choices. The proposed solution represents an advancement in assistive technology for the visually impaired and has the potential to improve their daily lives.

SYSTEM OVERVIEW
The system presented in this paper consists of a physical prototype and control software (Figure 1). The hardware includes an NFC module that reads the garment tags, a stepper motor screwed to the wardrobe roof for circular movement, and a servo motor on each hanger for 180-degree rotation during photo capture. Two photos (top and bottom) are taken and stitched together using an algorithm. A Raspberry Pi serves as the control unit to process data and provide analysis on request. An inside illumination system reduces reflections and creates a controlled environment for photo capture. Figure 1 depicts the mobile application as the intermediary between the user and the physical prototype, facilitating the interaction with the Raspberry Pi through a server that handles user requests. At the hardware level, the Raspberry Pi controls three main components: the NFC reader, the cameras, and the DC motors, as shown in Figure 9. The Raspberry Pi triggers the corresponding hardware components to capture garment photos and return them to the user in response to their requests. The software module includes two algorithms designed to determine the predominant colours of each garment. A Mask Region-based Convolutional Neural Network (Mask R-CNN) algorithm is used to carry out garment image segmentation and background removal, using a dataset of 480 annotated garment images. Additionally, an algorithm based on OpenCV analyses the pixels of the images to extract their Red-Green-Blue (RGB) values and determine the predominant colours. The upgrades over the previous system are significant. While the previous system [4], [5] used NFC communication for garment management, the proposed system combines this functionality with an automation mechanism that can move and rotate garments to capture multiple angles accurately.
The integration of a photo capture module further enhances the system's ability to detect and classify garments, allowing the system to determine the predominant colours of each piece accurately. The proposed system's enhanced capabilities represent a significant advancement in the field of garment management and classification, having also potential applications in various industries.

HARDWARE ARCHITECTURE
The functioning of the system has four distinct phases: NFC reading, photo capturing, hanger rotation, and circular movement. The NFC module is used to read the tags attached to each garment, collecting the unique identifier (UID) associated with each tag. Photo capturing is carried out via a vision system with one camera. Capturing a complete photo of a clothing item can be challenging, even with a camera equipped with a larger aperture lens, due to the short distance between the camera and the item in the developed prototype. To address this limitation, a servomotor is attached to the camera to facilitate a 180-degree rotation, allowing the capture of multiple positions of the item. These photos are subsequently merged using a stitching algorithm that combines the top and bottom photos into a complete photo of the garment. This approach makes it possible to capture clothing items accurately in the prototype and can be implemented in various wardrobe sizes, making it a versatile solution.
During the photo capture process, proper illumination is crucial to avoid dark areas. To achieve this, a white LED strip consisting of hundreds of LEDs emitting light at a colour temperature of approximately 6000 K is mounted on the wall where the cameras are situated. This type of illumination offers a diffuse, flexible, and cost-effective solution, allowing for a higher focus on each garment. For circular movement within the wardrobe, a stepper motor is affixed to the roof, with its shaft connected to a circular platform. This platform supports the weight of all the servomotors attached to the hangers, which are responsible for rotating each garment 180 degrees. The servomotors are attached to the underside of the circular platform, providing a flexible and cost-effective solution.
To establish a proof of concept and create an easily mountable solution, a small IKEA wardrobe with dimensions of 50 cm × 30 cm × 80 cm was selected for testing garments of appropriate size [38]. The circular movement of the garments inside the wardrobe is enabled by a stepper motor, installed on the roof, which drives a circular platform capable of supporting the weight of all servo motors attached to each hanger. The servo motors, responsible for rotating each garment 180 degrees, are mounted on the inferior surface of the circular platform. The NFC reader is attached to one of the side walls to allow for collision and detection of the UID associated with each tag. Figure 2 represents the hardware overview.
The system was implemented using the Raspberry Pi 3B+ single-board computer as its base. This board is a popular choice for complex projects due to its versatility and affordability. The board has 40 pins that can be used for various functions such as I2C, SPI, and UART communication. It has 1 GB of RAM, and storage is provided by a 16 GB SD card with the Raspbian operating system installed. To read the tags, a PN532 module that communicates via NFC was used. This module can detect tags up to 4 cm away and can be connected to the Raspberry Pi board using I2C, SPI, or UART.
The system uses two types of actuators: servomotors and a stepper motor. Servomotors are affixed to the hangers, enabling 180-degree garment rotation for photo capture, as depicted in Figure 3. Additionally, a servomotor connected to the camera enables two positions during photo capture. The stepper motor, in turn, is attached to the wardrobe's roof and moves the garments. It is controlled by a ULN2003 driver, which governs the motor's rotation. To take photos, an OV5647 camera module was used. It integrates the camera and a small board for connection to the BCM2835 processor through the Camera Serial Interface (CSI). This camera has a 5 MP resolution and can capture images up to 2592 × 1944 pixels. The CSI bus was used to connect this module to the Raspberry Pi.

SOFTWARE ARCHITECTURE
Regarding the software developed and implemented, the system has three main components: the server, the automation control software, and the image processing algorithms (Figure 4). To support the interaction between the user and the system, a server was implemented on the Raspberry Pi. The server receives the user's requests and returns the photo of the respective garment. This communication connects the user interface to the physical system.
The server was implemented using the Flask framework and hosted on the Raspberry Pi. The server receives photo requests from the user. A virtual environment was created using the Docker platform to enable the use of Detectron2 on a Windows machine for background removal. The client running in the Docker container receives garment photos from the user and returns the respective class and predominant colours.
Control software was developed to test and validate the wardrobe automation. The state machine presented in Figure 5 describes the system's functioning and includes six states: READY, STEPPER, DETECTION, CAPTURE, SERVO, and STITCH. When the user requests a garment, the initial state READY changes to STEPPER. In this state, the stepper motor controlling the garment movement inside the wardrobe is triggered and keeps moving until the tag with the requested UID is detected, at which point it stops. The state then changes to DETECTION, and the UID of the tag is displayed. The system then changes to the CAPTURE state, and top and bottom front photos of the garment are captured. Afterwards, the system moves to the SERVO state, where the servo motor is triggered and performs a 180-degree rotation. After the rotation, the system returns to the CAPTURE state and captures top and bottom back photos of the garment. The system then changes to the STITCH state, where a stitching algorithm from the OpenCV library is used to combine the photos. This stitching algorithm solves the problem of not being able to capture the full view of each garment in a single camera shot: it combines the images based on keypoint detection and on overlapping the points common to each picture.
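The state sequence described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' control software: the `run_request` function, the `FakeWardrobe` stand-in, its method names, and the `max_steps` guard are all assumptions introduced here so that the example runs without a Raspberry Pi. Stitching the front pair and the back pair separately is likewise an assumption.

```python
from enum import Enum, auto


class State(Enum):
    READY = auto()
    STEPPER = auto()
    DETECTION = auto()
    CAPTURE = auto()
    SERVO = auto()
    STITCH = auto()


def run_request(requested_uid, hw, max_steps=100):
    """Walk the six states once for a single garment request.

    Returns (stitched_photos, trace), where trace lists the visited states.
    """
    state, trace, photos, rotated, result = State.READY, [], [], False, None
    while state is not None:
        trace.append(state)
        if state is State.READY:
            nxt = State.STEPPER
        elif state is State.STEPPER:
            # Move the platform until the reader sees the requested tag.
            steps = 0
            while hw.read_uid() != requested_uid:
                if steps >= max_steps:
                    raise LookupError(f"tag {requested_uid} not found")
                hw.step()
                steps += 1
            nxt = State.DETECTION
        elif state is State.DETECTION:
            hw.show_uid(requested_uid)
            nxt = State.CAPTURE
        elif state is State.CAPTURE:
            photos.extend(hw.capture_top_and_bottom())
            # First pass goes on to rotate the hanger; second pass stitches.
            nxt = State.STITCH if rotated else State.SERVO
        elif state is State.SERVO:
            hw.rotate_hanger_180()
            rotated = True
            nxt = State.CAPTURE
        else:  # State.STITCH
            # Assumption: front and back pairs are stitched separately.
            result = [hw.stitch(photos[:2]), hw.stitch(photos[2:])]
            nxt = None
        state = nxt
    return result, trace


class FakeWardrobe:
    """Hypothetical stand-in for the hardware layer (no real I/O)."""

    def __init__(self, uids):
        self.uids = uids          # tags in platform order
        self.position = 0         # hanger currently in front of the reader
        self.back_facing = False
        self.shown = []

    def read_uid(self):
        return self.uids[self.position]

    def step(self):
        self.position = (self.position + 1) % len(self.uids)

    def show_uid(self, uid):
        self.shown.append(uid)

    def capture_top_and_bottom(self):
        side = "back" if self.back_facing else "front"
        return [f"{side}-top", f"{side}-bottom"]

    def rotate_hanger_180(self):
        self.back_facing = not self.back_facing

    def stitch(self, photos):
        return "+".join(photos)
```

Calling `run_request("C3", FakeWardrobe(["A1", "B2", "C3"]))` visits READY, STEPPER, DETECTION, CAPTURE, SERVO, CAPTURE, and STITCH in that order. On real hardware, the `FakeWardrobe` methods would be replaced by the motor, NFC, camera, and stitching calls.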

Background removal
AI is integrated into the system to accurately classify the type of garment, which is a critical feature since the user may not be aware of all the clothing items inside the closet. NFC communication is used only to select the clothing item and identify its respective UID for analysis. After the item is selected, the system performs segmentation and background removal to facilitate identification of the dominant colours of the garment. This process is necessary to determine accurately the colours requested by the user. The integration of AI, NFC communication, and image processing techniques allows the system to identify and classify clothing items accurately, which represents an advancement in the field of clothing management and classification. To accomplish this task, a method was needed to remove the image's background while respecting the garment's boundaries. After conducting multiple studies and research, a deep learning algorithm called Mask R-CNN was employed to segment the images. To model and train the algorithm, Google Colab framework was utilized, providing high processing power for the neural network training on a local machine.
To create the dataset, photos were gathered and divided into six types (classes) of garments: pants, shorts, t-shirts, shirts, polo shirts, and dresses. As there were not enough photos available, several images were obtained from an online dataset [39]. The final dataset comprised a total of 480 garment photos, divided equally among the six classes with 80 photos per class. To ensure the accuracy of the garment boundaries, the images were annotated with the Visual Geometry Group (VGG) Image Annotator tool. Following the annotation process, a function was developed to load the annotations in JSON format from the dataset. This function assigned each annotation to the corresponding class, as well as other attributes, such as region_attributes, shape_attributes, and the x/y coordinates of the bounding box.
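As an illustration of the annotation-loading step, the sketch below parses a JSON export from the VGG Image Annotator into a flat list of regions. It is a minimal sketch under stated assumptions, not the function used in this work: the function name `load_via_annotations`, the attribute key `"class"` used to store the garment type, and the derivation of the bounding box from the polygon points are all hypothetical choices made for this example.

```python
import json


def load_via_annotations(json_text, class_key="class"):
    """Parse a VIA JSON export into one record per annotated region.

    class_key is the (assumed) name of the region attribute that holds
    the garment class, e.g. "pants" or "dresses".
    """
    records = []
    for entry in json.loads(json_text).values():
        regions = entry.get("regions", [])
        # VIA 1.x stores regions as a dict keyed by index, VIA 2.x as a list.
        if isinstance(regions, dict):
            regions = list(regions.values())
        for region in regions:
            shape = region["shape_attributes"]
            xs, ys = shape["all_points_x"], shape["all_points_y"]
            records.append({
                "filename": entry["filename"],
                "label": region["region_attributes"][class_key],
                "polygon": list(zip(xs, ys)),
                # Bounding box as (x_min, y_min, x_max, y_max).
                "bbox": (min(xs), min(ys), max(xs), max(ys)),
            })
    return records
```

Each record can then be converted into whatever structure the training framework expects; only polygon regions are handled in this sketch.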
Transfer learning was employed in the training process, which involved using a pretrained model. The dataset was then loaded, and several parameters were adjusted to control the flow of training.
The learning rate was set to adjust the speed at which the neural network converges to the goal. This involved a trade-off between low values, which could slow down the training, and high values, which could cause the model to diverge from the goal. The number of iterations was adjusted to maximize the Average Precision (AP) and Intersection Over Union (IoU) as well as to prevent overfitting. The neural network was trained using specific parameters and further tested with various garments to ensure its accuracy. A function was developed to segment and classify garment images uploaded by the user, which takes an input image and produces three output images: the segmented image, the final image with the background removed, and the corresponding binary mask. Additionally, a colour detection algorithm is used to analyse the pixels from the region of interest in each image, as shown in Figure 6.
Figure 6. Image segmentation flowchart.

Colour detection
After the background removal, the garment image is passed to a colour detection algorithm that detects the predominant colours in the image using the OpenCV library. The algorithm analyses the RGB values of each pixel in the image, each ranging from 0 to 255, to determine the colour based on the combination of the three components. The process begins by defining two arrays containing colour names and their respective RGB values. The algorithm then uses the binary mask and the final image, both generated by the deep learning algorithm that removes the background, to extract from the final image the pixel values at the non-black coordinates of the binary mask. Each pixel within the clothing region of interest is assigned to the closest colour in the colour array, and the frequency of each colour is counted to calculate the percentage of each colour present in the garment. To avoid potential segmentation errors, colours with percentages below 10 % are not considered. This approach ensures that the colour analysis is carried out exclusively on the pixels identified as part of the clothing region of interest, focusing on the dominant colours of the clothing item while minimizing the impact of segmentation errors on the results.
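The procedure above can be sketched as follows. This is a simplified illustration and not the implementation used in this work: it uses plain Python lists instead of OpenCV arrays, a short hypothetical palette (the RGB values are the standard CSS values for those colour names), and squared Euclidean distance in RGB space as the assumed measure of "closest colour". Pixels are assumed to be in RGB order, whereas images loaded with OpenCV are in BGR order and would need converting first.

```python
from collections import Counter

# Hypothetical palette for illustration; the paper's colour array is larger.
PALETTE = {
    "Cadet Blue": (95, 158, 160),
    "Slate Grey": (112, 128, 144),
    "Sea Green": (46, 139, 87),
    "Rosy Brown": (188, 143, 143),
    "Teal": (0, 128, 128),
}


def closest_colour(pixel, palette=PALETTE):
    """Return the palette name with the smallest squared RGB distance."""
    return min(
        palette,
        key=lambda name: sum((a - b) ** 2 for a, b in zip(pixel, palette[name])),
    )


def predominant_colours(image, mask, palette=PALETTE, min_share=10.0):
    """Percentage of each named colour inside the masked garment region.

    image: rows of (R, G, B) tuples; mask: rows of ints, non-zero = garment.
    Colours below min_share percent are dropped to absorb segmentation noise.
    """
    counts = Counter(
        closest_colour(pixel, palette)
        for image_row, mask_row in zip(image, mask)
        for pixel, keep in zip(image_row, mask_row)
        if keep  # skip background (black) mask positions
    )
    total = sum(counts.values())
    if total == 0:
        return {}
    shares = {name: 100.0 * n / total for name, n in counts.most_common()}
    return {name: share for name, share in shares.items() if share >= min_share}
```

Because the filtered percentages no longer sum to 100 %, the remainder corresponds to the low-frequency colours that were discarded.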

RESULTS
This section presents the results obtained to confirm and validate the proper functioning of the different modules previously introduced [40]. In addition, the flowchart of the control software developed, which integrates the hardware and software of the wardrobe automatism, is presented in Figure 7.
The pictures of a garment captured by the mechatronic system, which are subsequently combined by the stitching algorithm, are shown in Figure 8. The automatic wardrobe system was designed to provide controlled photo capture, thereby eliminating dark fields and shadows. A test was conducted on a garment to demonstrate the effect of the controlled photo capture. Figure 9a) and Figure 9b) depict photos captured outside the controlled system, which exhibit shadows and dark fields. In contrast, the photo captured inside the automatic wardrobe, shown in Figure 9c), is properly illuminated, highlighting the garment's features and eliminating any reflection.
The background removal algorithm used the Mask R-CNN neural network, and several training sessions were carried out to optimize all parameters for maximum Mean Average Precision (mAP) and IoU. Initially, the model was trained using a batch size of 128, which was later reduced to 64. During the training process, a large number of iterations were carried out to determine the point at which the model started overfitting. The mAP results for each class after 5000 iterations are presented in Table 3.
Table 3. Mean Average Precision (5000 iterations).
As can be seen from Table 3, the value obtained for the Dress class is not as expected, as the model has difficulty recognizing the dress silhouette correctly, given that the dresses in the dataset do not have a similar shape. Subsequently, by adjusting the number of iterations to 3250, the results improved, with the mAP value increasing to 85.667 %. This increase was largely due to the improvement in the Dress class value (Table 4). Subsequently, a test was carried out with a slightly lower number of iterations, i.e., 3000 iterations. In this test, the overall mAP value decreased slightly; however, the value obtained for the Dress class increased, approaching the other class values (Table 5).
Thus, two training runs were conducted with 2750 and 2500 iterations (2nd and 3rd rows, respectively), as presented in Table 6. Table 6 illustrates that the model's performance starts deteriorating as it enters the underfitting phase after 2500 iterations. Although the mAP value obtained for 2750 iterations was reasonable, the Dress class's mAP fell short of expectations, being considerably lower than that of the other five classes. Furthermore, the IoU values were also obtained during the previously performed training runs, as can be seen in Table 7. To obtain this metric, a function was used that calculates IoU values with a variable threshold between 0.5 and 0.95, spaced by 0.01. The function goes through the 48 images of the validation dataset and obtains the respective IoU value for each photo by comparing the mask predicted by the neural network with the real mask. In this calculation, the two masks are overlaid, the areas of their intersection and of their union are obtained, and the metric is calculated as the ratio between the two.
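The intersection-over-union calculation for a pair of binary masks can be written compactly. The sketch below is an illustration only, using plain nested lists rather than image arrays; the function names `mask_iou` and `mean_iou` are introduced here, and the threshold sweep between 0.5 and 0.95 mentioned above is not reproduced.

```python
def mask_iou(predicted, truth):
    """IoU of two binary masks given as equally sized rows of 0/1 values."""
    intersection = union = 0
    for predicted_row, truth_row in zip(predicted, truth):
        for p, t in zip(predicted_row, truth_row):
            p, t = bool(p), bool(t)
            intersection += p and t   # pixel is in both masks
            union += p or t           # pixel is in at least one mask
    # Two empty masks have no union; report 0.0 rather than divide by zero.
    return intersection / union if union else 0.0


def mean_iou(mask_pairs):
    """Average IoU over (predicted, truth) pairs, e.g. a validation set."""
    scores = [mask_iou(predicted, truth) for predicted, truth in mask_pairs]
    return sum(scores) / len(scores) if scores else 0.0
```

For example, a predicted mask that shares two pixels with the real mask while four pixels are covered by at least one of them yields an IoU of 2/4 = 0.5.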
Following the validation of the developed automation, the background removal and colour detection algorithms were tested using several images. Three garment photos were used in this testing phase, with one captured within the prototype and the other two taken using a smartphone. The first photo, which is depicted in Figure 10, revealed that the segmentation algorithm was not perfect. Specifically, the model wrongly identified some background pixels as Region of Interest, especially around the hanger. To address possible errors resulting from poor segmentation, the colour detection algorithm only considered percentages above 10 %. The percentage analysis revealed a predominance of Cadet Blue (40 %) with two other colours, Slate Grey (20 %) and Sea Green (12 %) (Figure 11).
The green colour corresponds to background pixels, which result from the difficulty in accurately distinguishing the region of interest from the background, especially near the hanger. Two additional photos were captured using a smartphone, as shown in Figure 12 and Figure 14. Figure 12 shows a precise segmentation, with the colour distribution divided into three colours: Dark Cyan (32 %), Light Sea Green (27 %), and Teal (19 %) (Figure 13). The remaining 22 % encompasses colours with low percentages that are irrelevant for this analysis, such as background pixels that were erroneously considered as Region of Interest, especially around the garment's edges. In the last test, the segmentation algorithm was capable of distinguishing the Region of Interest from the background. The colour analysis shows the following colour distribution: Rosy Brown (50 %), Grey (29 %), and Dark Grey (17 %) (Figure 15).
Figure 10. T-shirt segmentation and classification.
Figure 11. T-shirt colour distribution.
As previously stated, one of the goals of this study was to validate the system with the blind community. To achieve this, a preliminary validation was conducted through an interview with a blind representative from the ACAPO association. The purpose of the interview was to address important questions regarding the use of such systems by visually impaired individuals. The interview began with an overview of the system to introduce all the relevant details. Following this, several questions were posed to identify the challenges blind people face when selecting and managing garments. The interview revealed that distinguishing between the different colours on each garment was a significant challenge. However, insights gained from the validation process suggest that presenting the location of each colour would be more relevant for blind users than presenting the percentage of each colour. Moreover, it was suggested to include audible feedback during the selection process, such as a gradually rising sound or a voice system to indicate the start and the end of the process.

CONCLUSIONS AND FUTURE WORK
This work falls within the field of technology that can be used to assist human needs [41]-[43]. The primary objective of this paper was to present a system that assists blind people in choosing their clothes. The design of the prototype was the first part of the project. It required finding on the market a wardrobe that would allow the proof of concept in a home environment. To develop the system, it was necessary to study the requirements for its operation, particularly with regard to lighting, tag reading, rotation and movement of the clothes, and image capture. To operate these modules, an algorithm was developed to control each of the existing electronic elements. Following the validation and individual testing of each module, they were integrated into a single system controlled from a command line interface. To obtain the predominant colours of the clothes, colour detection and background removal algorithms were implemented, with the latter using a neural network. The prototype presented is part of a larger project under development, with its future integration planned for a mobile application as part of the MyEyes system [4], [5]. In this regard, a server for user request management has already been implemented. This integration will allow for the replication of the physical wardrobe on a mobile device, enabling clothing selection with a single click. In summary, the results indicate that the system may be an efficient solution to reduce the time taken for garment selection, particularly in terms of colour differentiation and the selection of combinations for blind people.
However, as the developed prototype is a proof of concept, the system has some limitations, including the reduced size of the wardrobe. This allowed only small-sized clothes and a limited number of items to be placed inside the wardrobe at a time. All the electronics were designed specifically for this small prototype and will need to be resized and relocated for the development of a prototype that holds larger clothes. When a larger physical prototype is integrated with MyEyes [4], [5], a more comprehensive validation will be conducted with the blind community, making it possible to test the developed system with a view to its possible commercialization.