1. Introduction
Stability testing, or product safety testing, encompasses many different aspects of a product formulation. There needs to be proof that the preservative system is effective, that the product is physically stable, and that the product does not interact negatively with the packaging. Preservative systems are tested through challenge testing, where samples of a product are inoculated with different types of bacteria [1]. Physical product stability is tested by placing samples in inert glass jars and subjecting them to different environmental conditions; this type of testing is the main focus of this paper. Product packaging is tested by filling the agreed final packaging with the product and again subjecting it to different environmental conditions; further testing is also done to assess how the packaging functions with the type of product.
There are many guidelines on protocols that can be followed when designing a suitable stability testing regime, from the established to more recent thinking. This paper will examine those recommended guidelines in the face of a changing retail landscape. It will explore the International Federation of Societies of Cosmetic Chemists (IFSCC) monograph and recommendations from Cosmetics Europe, through to more recent work such as that of the UK-based Centre for Process Innovation (CPI). There have been significant changes in the retail landscape over the decades as the direct-to-consumer (DTC) model has gained prominence. In addition to the DTC model, there has been a shift towards so-called green consumer behaviour, with consumers and producers being more open to the use of natural cosmetics [2]. This can give rise to additional challenges in stabilising formulations, as considered by Singh et al. in their work examining carrot seed oil-based emulsions [3]. For the purpose of this paper, however, all cosmetic preparations will be considered equally. The paper seeks to examine the protocols available and consider their suitability in the face of the shift in retail behaviour.
2. A Background to Stability Testing
Stability is defined in the Cambridge Dictionary as a situation in which something is not likely to move or change [4]. Transposed to the context of a cosmetic product, it can be defined as remaining within a set specification. A specification might include characteristics such as appearance, pH, viscosity or efficacy, which need to be measurable and applicable to the product type. An important specification attribute is microbial stability: ensuring there is no growth of bacteria or other microorganisms in the product throughout its shelf life is one of the qualities that ensures consumer safety. Stability assessment also preserves the reputation of brands by making sure products are aesthetically acceptable to consumers [5,6]. Stability testing helps to establish the shelf life of a product, the period during which the product continues to be safe and fit for use.
As defined by the European Cosmetics Regulation (EC) No 1223/2009, Part A of a Cosmetic Product Safety Report shall contain the physical/chemical characteristics and stability of a cosmetic product. Part B of the report, the safety assessment, is required to consider the impact of stability on the safety of the cosmetic product. However, the Regulation does not specify how a stability test should be performed. A stability test is essential to evaluate a product’s shelf life. Article 19 of the Regulation stipulates that a product must be clearly labelled with the date of minimum durability if this is less than 30 months. For products with a minimum durability of over 30 months, an indication of the period of time after opening (PAO) shall be given instead [7]. No specific instruction on how to calculate the PAO is provided in the Regulation. However, the microbiological stability and the packaging of a product should be considered when determining the PAO.
Following the exit of the United Kingdom from the European Union in 2020, cosmetic products in the UK are regulated by Schedule 34 of The Product Safety and Metrology, etc. (Amendment, etc.) (EU Exit) Regulations 2019, also called the UK Cosmetics Regulation. The stipulations regarding stability testing are the same as those of (EC) No 1223/2009 [8].
In the United States of America, cosmetic products are regulated by the Food and Drug Administration (FDA) under the Federal Food, Drug and Cosmetic Act (FD&C Act). The FD&C Act prohibits the distribution of cosmetics which are adulterated or misbranded. It does not state any requirements regarding product stability or shelf life [9]. However, as the product must be safe, it is the responsibility of the manufacturer to verify the product’s shelf life [10]. Products which fall under the category of an over-the-counter drug in the US, such as sunscreens or anti-acne treatments, must conform to Current Good Manufacturing Practice for Finished Pharmaceuticals, which provides some specifics regarding testing [11]. Further details on storage conditions, testing frequency and more are provided in the supplementary guidance for industry Q1A(R2) Stability Testing of New Drug Substances and Products [12]. Like the US, Canada does not set rules or guidelines for stability testing of cosmetic products but defines a set of guidelines for drug products in Guidance for Industry: Stability Testing of New Drug Substances and Products [13].
The Association of Southeast Asian Nations (ASEAN) market primarily follows the EU Cosmetics Regulation, and its Guidelines for the safety assessment of a cosmetic product specify that stability needs to be considered [14]. Stability data need to be provided in Part III: Quality Data of Finished Product as part of the Product Information File [15]. However, no exact details on the test protocol are provided.
3. Current Recommended Guidelines for Stability Testing
Due to the extremely wide variety of products produced in the personal care industry, there is no single stability testing procedure that manufacturers are required to follow when producing a new cosmetic product. Instead, a number of recommended guidelines have been published by global cosmetic associations such as the IFSCC and Cosmetics Europe; these publications suggest protocols to follow when carrying out stability testing. In 2018 the British Standards Institution published ISO/TR 18811:2018 Cosmetics–Guidelines on the stability testing of cosmetics [16]. The document does not aim to specify how a stability test should be performed but does review the information provided in previously published documents on stability testing, such as those published by the IFSCC and Cosmetics Europe. The ISO document is a good starting resource to help manufacturers select the correct protocols when designing a stability test.
The IFSCC is a worldwide federation whose purpose is to promote international cooperation within the personal care industry [17]. In 1992 the IFSCC published a monograph titled “The Fundamentals of Stability Testing” [18], which covers the types of tests and the varied conditions that it recommends using when performing a stability test.
Cosmetics Europe is a membership-based trade association specifically for personal care manufacturers within Europe; it provides expert knowledge on European legislation and developments within the industry [19]. In 2004 Cosmetics Europe published a report titled “Guidelines on Stability Testing of Cosmetic Products” [20], which sets out guidelines to predict and guarantee the stability of cosmetic products in the market. The publications by the IFSCC and Cosmetics Europe set out similar guidelines, as certain tests have been identified as the best way to test stability. However, it is usually down to the manufacturer to design the specifics of the test, hence the need for guidelines.
There are multiple reasons why a product might need to be stability tested. The IFSCC recommends that the purpose of the test be identified before testing starts so that a suitable test procedure can be followed. Reasons why a product would require stability testing include assessment of a new product development (NPD) formulation, assessment of an NPD formulation with its packaging, modification of a method or formulation from the original, or a change of product container [20].
The main test the IFSCC covers is the standard stability test, the fundamental test that cosmetic products undergo during stability testing. Samples are placed in different temperature environments and the product’s reactions to these conditions are observed over a set amount of time. Warmer temperatures accelerate any reactions that may occur under normal shelf-life conditions; as a rule of thumb, at a molecular level a temperature increase of approximately 10 °C doubles the rate of reaction. This rule does not apply accurately to more complex systems such as cosmetics, but an increased reaction rate of any amount is beneficial in personal care as it is a fast-moving consumer goods (FMCG) industry. The shelf life of a product can be up to 2–3 years, but the time frame for developing a product from brief to launch is much shorter, usually around 6 months to 1 year, so accelerated testing that takes only 3–4 months to produce a full shelf-life prediction is extremely useful.
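As a rough illustration of the 10 °C rule of thumb described above, the following sketch computes how much faster a reaction would proceed at an elevated storage temperature if its rate really did double with every 10 °C rise. The function names, the 20 °C reference temperature and the doubling factor are assumptions made for this example; because the rule does not apply accurately to complex systems such as cosmetics, the result is an idealised estimate and not a shelf-life prediction.

```python
def acceleration_factor(test_temp_c: float,
                        reference_temp_c: float = 20.0,
                        doubling_factor: float = 2.0) -> float:
    """Idealised rate multiplier at test_temp_c relative to reference_temp_c.

    Assumes the rate is multiplied by `doubling_factor` for every 10 degC rise
    (a simplification; real cosmetic systems often deviate from it).
    """
    return doubling_factor ** ((test_temp_c - reference_temp_c) / 10.0)


def equivalent_months(months_on_test: float,
                      test_temp_c: float,
                      reference_temp_c: float = 20.0) -> float:
    """Months at the reference temperature notionally represented by the test."""
    return months_on_test * acceleration_factor(test_temp_c, reference_temp_c)


if __name__ == "__main__":
    # Hypothetical example: three months of storage at 40 degC.
    print(equivalent_months(3.0, 40.0))
```

Under these assumptions a sample stored at 40 °C ages four times faster than one at 20 °C, so three months on test would notionally represent twelve months at the reference temperature.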
Table 1 details the suggested storage conditions for the accelerated stability test [18].
The 4 °C sample is usually used as a control, as the cold temperature will slow down any changes that may occur. This sample and the 20 °C sample are kept on test for the full shelf life of the product so that a real-time stability test can also be carried out to confirm the results of the accelerated stability test. For standard tests, 45 °C is generally the maximum temperature. Testing samples at higher temperatures such as 50 °C, 60 °C and 70 °C would theoretically produce quicker accelerated results, but the IFSCC does not recommend it for standard testing: the further removed the conditions are from normal everyday conditions, the more likely the product is to undergo adverse changes that would never occur under normal conditions. However, testing at these temperatures can be a good initial indicator of the stability of a product; if a sample is still stable at temperatures as high as 60 °C and 70 °C, there can be confidence that the product will be very stable under normal conditions [12].
Samples need to be observed regularly throughout the testing period for any changes that may occur. The IFSCC recommends that samples are checked frequently at the beginning of a test, generally monthly, but once the test progresses past the first 3 months the testing intervals can be spaced further apart. It also suggests that multiple samples of a product be placed in each condition, enough for each test, so that a new unopened sample is removed from the condition and tested at each interval. When the samples are tested, they should be checked against a control sample for any changes in appearance, odour, texture, viscosity and pH; weight loss can also be checked in packaging compatibility samples. Acceptance limits need to be set for each of these properties before testing starts: results will likely fluctuate, but the limits define the range within which the product is still considered stable and safe for use.
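The idea of checking each interval's results against limits agreed before the test starts can be sketched as follows. This is a minimal illustration only: the property names and the numerical limits (a pH range and a viscosity range) are hypothetical values chosen for the example, not figures from the IFSCC monograph, and qualitative properties such as appearance or odour would need a separate, non-numeric assessment.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SpecRange:
    """Inclusive acceptance range for one measurable property."""
    low: float
    high: float

    def contains(self, value: float) -> bool:
        return self.low <= value <= self.high


def out_of_spec(measurements: dict, spec: dict) -> list:
    """Return the sorted names of measured properties outside their range.

    Properties without an entry in `spec` are ignored.
    """
    return sorted(
        name for name, value in measurements.items()
        if name in spec and not spec[name].contains(value)
    )


# Hypothetical limits, set before the test starts (illustrative values only).
EXAMPLE_SPEC = {
    "pH": SpecRange(5.0, 6.0),
    "viscosity_mpa_s": SpecRange(20000.0, 30000.0),
}

if __name__ == "__main__":
    # Hypothetical readings from one sample at one test interval.
    reading = {"pH": 5.4, "viscosity_mpa_s": 31000.0}
    print(out_of_spec(reading, EXAMPLE_SPEC))
```

In this example the pH reading is within its range while the viscosity reading is flagged, which is the kind of result that would prompt further investigation of the sample.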
Table 2 details further testing that may be necessary depending on product or formulation type [18].
These types of tests are usually performed on products that may face particular conditions in the market, or where a formulation or product is known to have had instability issues in the past. Specifically, cycling and freeze–thaw tests are useful when products are planned to be shipped internationally, as they could be subjected to varying extreme temperatures during transport; these tests help to ensure that the products can withstand such temperature changes.
4. Modern Advances in Stability Testing
The demand for innovative cosmetic products that meet current trends is at an all-time high. Competition between cosmetic brands to unveil pioneering concepts is fierce, with indie labels growing rapidly to contend with the brands that hold the majority of market share. Online purchasing is enabling consumers to access more beauty brands than ever before, and purchasers no longer have to depend on the limited offering of their local store [21].
Consequently, brands now feel more pressure to quickly capitalise on opportunities in the market whilst the trend is still booming. Standard product development timings consisting of thorough evaluations of product stability are no longer acceptable and are now often ruthlessly reduced to meet the expectations of rapid, ground-breaking beauty launches.
Cutting the time allowed to complete accelerated stability testing is risky. In a review of the current stability testing guidance, Postles established that the techniques involved in accelerated stability testing are generally unreliable, concluding that the methodology is inappropriate for predicting long-term shelf life [22]. This conclusion was directed particularly at emulsion technology, which forms the basis of many cosmetic formulations. In the 2018 study, Postles highlighted some potential areas of improvement to the current guidelines. One recommendation was to encourage the often-overlooked method of real-time stability testing to accompany the accelerated testing data. This is endorsed to confirm that the changes observed during the initial programme were accurate and to investigate whether further instabilities could be expected. It is suggested that real-time testing is conducted 6 to 12 months ahead of industrial scale-up so that brands can react quickly to any unforeseen changes that may occur when the product is released to market [22]. In the modern market this process may be considered too time-consuming and may add further delay to launches, so it is often disregarded in favour of post-market surveillance to track the success of a formulation. This raises the question of why an inherently unreliable method for testing stability can be condensed to meet fast-fashion principles in the beauty market, and what modern advances in technology can be used to strengthen the reliability and accuracy of the accelerated testing method.
An effective approach to developing bespoke cosmetic formulations with consumer-perceptible differences is the use of minimally disruptive formulations (MDFs). The MDF concept was described by O’Lenick and Zhang and involves modifying a simple, stable base formulation with low levels (<10%) of different silicone polymers to alter the product aesthetics with little disruption to the base’s stability profile. The choice of silicone polymer can provide a range of feel, playtime and gloss to the base formulation, giving the formulator confidence that different consumer-perceivable aesthetics are achievable with reasonably predictable stability testing results [23]. The MDF concept can also be extended to include the introduction of low levels of novel active ingredients into the base. The use of MDFs by beauty brands can dramatically reduce development and stability testing requirements, and substantially reduce costs and time frames, without compromising on quality. This method allows brands to react quickly to consumer demands and reduces the risk of missing out on market opportunities.
Advances in technology can also assist with reviewing formulation stability in a timely manner. UK-based CPI has developed its MicroSTAR technology, a microfluidic platform for stability prediction. This automated method conducts physical testing over shorter timescales to generate data for long-term stability prediction with minimal resource and cost. The microfluidic platform applies variations in temperature, flow, pressure and vibration to mimic the diverse environmental conditions that a product may be subjected to during its shelf life. The platform is also capable of revealing insights into the drivers of stability failure that cannot be detected using traditional stability testing methods. CPI offers this technology to enable its clients to conduct the relevant testing in as little as several hours [24].
An alternative method for early stability prediction is static multiple light scattering (SMLS). SMLS is a high-resolution optical analysis method capable of characterising and measuring the destabilisation of liquid dispersions. It is particularly useful for cosmetic emulsions, which are thermodynamically unstable and consequently prone to sedimentation, creaming and flocculation. The test is carried out on undiluted samples (dilution can affect the dispersion state), quantifies the rate of change in particle size and migration, and can provide tangible results long before the changes are perceptible to the human eye. This process is much quicker than traditional accelerated stability testing methods and could be employed by formulators to forecast formulation stability promptly and precisely [25,26]. Postles advocates the use of modern analytical equipment to strengthen the reliability of shelf-life prediction, citing methods such as zeta potential measurement and controlled centrifugation [22].
5. Changes in the Retail Market
How has the retail market changed over the years, and how might this influence product stability protocols? Imogen Matthews considers the flexible approach to logistics in a much-changed market, noting that online retail plays a vital role in keeping consumers supplied with beauty goods. She remarks that essential to the success of any brand is the smooth delivery of products in pristine condition to the retailer, online store or directly into the hands of the consumer [27]. Looking at the direct-to-consumer (DTC) business model, Schlesinger, Higgins, and Roseman noted that for decades a handful of brands dominated the consumer retail market across many categories, including beauty. They considered that the rise of the internet and social media platforms brought the rise of the DTC model, but concluded that ultimately an omnichannel approach is favourable for growth [28]. This conclusion is supported by a survey McKinsey carried out in 2019 reviewing shopping habits by age group across the cosmetic and skin-care product sectors. It found that while the baby boomer generation had a strong preference for browsing and buying in store, generation Z had no strong preference for one type of shopping habit, instead preferring an omnichannel approach [29,30].
We also need to consider the impact of the COVID-19 pandemic on consumer shopping habits. By June 2020, Cosmetics Business had revealed key insights in a COVID-19 strategy report. It stated that the pandemic had placed unprecedented challenges on the beauty and personal care industry, leaving no brand or retailer untouched. It cited data from McKinsey estimating a decline of 20–30% in global beauty industry revenues in 2020. The report considered that the difficulties experienced by bricks-and-mortar retailers prior to the pandemic had been accelerated. It surmised that as physical stores reopened, brands and stores would need to develop new ways for consumers to discover their products [31]. This train of thought is echoed by Lanteri, who notes that for a long time selling skincare products had largely revolved around brands providing interactive experiences in store [32]. In addition, McKinsey notes that pre-COVID-19, in-store sales accounted for up to 85% of beauty purchases in most beauty industry markets, outweighing the appeal of shopping online [30]. However, the pandemic has accelerated the shift to online for a much broader section of consumers. Culliney considers this too, concluding that beauty brands and retailers must blur physical retail with digital experiences to engage consumers in a post-pandemic world [33]. This opinion is shared by McKinsey in its consideration of the long-term impact of COVID-19 on the beauty industry. It concludes that some changes are likely to be permanent, most notably the rise of digital platforms and the pace of innovation. Its surveys show that across the globe consumers indicate that they are likely to increase their online engagement and spending. It concludes from this that brands will need to prioritise digital channels and overhaul their product innovation pipelines to capitalise on this shift in consumer behaviour [30].
6. Conclusions
Accelerated stability testing is not a precise science but rather a prediction of shelf life. The varied retail approach, coupled with the need for pristine product condition, highlights the challenges involved in designing a suitable stability protocol. Considering the DTC model, it is important to note that brands are able to reach a global audience where previously they might have retailed only in bricks-and-mortar stores in their immediate territory. The protocol must therefore consider various modes of transport and sizes of shipment to potentially global destinations. The container vessel that became stuck in the Suez Canal in March 2021 highlighted the vulnerability of global shipping. Whilst we may not be able to design a protocol that accounts for a product being delayed by this type of force majeure event, we must consider that containers can be delayed in global shipping and as a result can be exposed to extremes of temperature. In addition, new and emerging ways in which consumers interact with products must be taken into account. The product must remain in specification, being safe, fit for purpose, efficacious and acceptable to the consumer for the duration of the shelf life, in order to preserve brand reputation. Stability testing is essential in evaluating a product’s shelf life, and yet regulation does not set out how the test should be performed. The global cosmetic associations IFSCC and Cosmetics Europe have published suggested protocols; however, these date from 1992 and 2004, respectively. The more recent work of Postles encourages the use of real-time stability testing; however, this is suggested to begin 6–12 months ahead of industrial scale-up, which is at odds with a brand’s desire to move quickly to market. Traditional accelerated testing could be accompanied by additional tests such as cycling, freeze–thaw and centrifuging. Where speed to market is critical, employing the formulating approach of minimally disruptive formulations de-risks the process.
The advantage of the DTC model is that brands are not bound by the shelf-life requirements of a retailer and so have the option to consider a shorter shelf life at launch. Post-market surveillance can continue after launch, and the shelf life can then be gradually extended once more data are gathered to support this. In conclusion, to address the diverse market conditions in the context of stability and shelf life, a diverse set of protocols, together with more product-specific stability strategies, would seem to be the most logical approach.