The Industrial Revolution of Retouching
Data-Driven Post-Production for E-Commerce
Developing a Retouching Smart Factory
Everything Pixelz does and is able to do is predicated on the proprietary virtual factory we’ve created in order to industrialize post-production and bring the benefits of lean manufacturing to retouching. The following sections will provide context for the history of Pixelz and the e-commerce image editing industry, and will share what we have learned about best practices in creating a world-leading company by means of manufacturing principles, data science, AI and automation, gamification, and smart factory principles.
Introduction: Condensing centuries to a decade
In the 1700s, the steam engine gave birth to an industrial revolution, enabling the mechanization of production processes. A century later, the first conveyor belt was created, marking the beginning of the second industrial revolution. Another century later, in the mid-1900s, the first Programmable Logic Controller (PLC) was created, which enabled the rise of mass production via automation, computing, and robotics.
Industrial production remains the main driver for innovation, growth and job creation in most countries of the world. In recent decades, the evolution of industrial manufacturing has become focused on “Smart Factories” characterized by connectivity of people and machines, and autonomous intelligent systems which can make decisions using enormous amounts of stored and real-time data, all of which streamlines the distribution process into its most optimal form. In recent years, the trajectory of intelligent manufacturing has focused on flexible and personalized customization of production orders, allowing each customer’s specific needs to be achieved without negatively impacting production efficiency.
Although this industrial evolution occurred over the span of a few centuries, Pixelz has experienced similar evolution in less than one decade. In Pixelz’ earliest days, like most post-production companies, Pixelz received orders from customers and one employee would work on one image from start to finish. There was no virtual assembly line, no automation, no data-capturing or intelligent systems. To progress, Pixelz has invested in hundreds of thousands of developer hours to create an intelligent virtual factory, a proprietary system called S.A.W.™, “Specialist Assisted Workflows.” Today, this system is the central engine that drives our company forward.
Simply put, S.A.W.™ is a virtual assembly line for editing images. Much like the history of industrialization, the assembly line was created first, and continued evolution has involved the collaboration of humans and machines to capture data and automate processes, which enables:
- Speed of delivery
- Consistent quality
- Personalization of orders
- Mass-production scale
Normally, these four attributes work against each other in a manufacturing facility. For example, a focus on faster delivery times can lead to reduced quality, personalized orders can increase delivery times, and mass production inhibits personalization. However, when the assembly line is combined with intelligent human-automation hybrid systems, these four attributes begin to work in tandem and reinforce each other.
For this reason, the S.A.W.™ system may be rightfully classified as disruptive technology in Post-Production commerce, and the next sections of this chapter will elaborate on the system’s creation, inner workings, and benefits to the industry and customers.
The stages of retouching industrialization
Pixelz has industrialized post-production by applying lean principles and data science to what was essentially a cottage industry.
Phases of Evolution
Phase 1: Cottage Industry
Jobs are run in emails, spreadsheets, free PM tools. Customer knowledge is stored in personal experience (an individual’s memory). Editors are expected to have all technical skills when hired and then learn individual clients on the job. No structured training, very little management and back-office. QA is visual by eye test. No subcontractors. Difficult to scale up.
Phase 2: Enterprise with a Production System
Jobs are run in a production system, either proprietarily developed or by adapting standard systems like ERP. There’s a customer interface for managing orders. Less experienced editors are hired and then trained for 3-6 months before contributing. Editor churn and long onboarding means up to 100 editors are in training all the time, impacting client onboarding times and quality. Customer requirements are stored in the system, but knowledge is stored in both individual and system memory. QA is still the eye test. Subcontractors are used, but transferring knowledge and structure is hard.
Phase 3: Global Company with Assembly Line
Jobs are run in a proprietary system with an extensive customer portal. All customer knowledge is transferred to the production system. Job planning is done algorithmically by the system. Editors are hired and trained to specific steps and can contribute in 1-2 weeks. Training is structured and assigned algorithmically by performance analysis. QA is standardized and performed by both computers and eye. Subcontractor use is no different than running with own employees.
Phase 4: AI-Powered Assembly Line
At least half of all editing is performed by AI and neural networks, which are progressively implemented into the assembly line. Humans validate output and specialize in advanced retouching steps. Data analysts comb logs looking for improvement opportunities. AI bots are spun up and down as needed for virtually infinite scale. Requires significant development resources.
At traditional post-production companies, regardless of their size, one image editor performs all the editing on an individual image from start to finish. We consider this either “Phase 1” or “Phase 2” on the evolutionary tree of Post-Production.
There are many companies in Phases 1 & 2, but the leap to Phase 3 is rarely made. It is our industry’s version of the “Great Filter” and requires changing every process on which the company heretofore operated.
Phase 3 is the introduction of an assembly line and proprietary production system. Customer knowledge is transferred into the system and standardized, job planning is done algorithmically by the system, and so on.
Phase 4 is the future of virtual factories, combining Deep Learning and intelligent automation systems with human workers to achieve superior quality and speed of delivery, and allow for personalization at a mass scale, which cannot be achieved without such intelligent systems.
S.A.W.™ - Specialist Assisted Workflows
The system we built to enter into Phase 3 is named S.A.W.™, Specialist Assisted Workflows. It’s a hybrid system that combines the best of machine and human in a format familiar to anyone with factory experience.
The retouching process is broken down into component steps that are handled by specialist editors and automated processes on a virtual assembly line with automated traffic control. Those editors may be sitting next to each other, or they may be in completely different countries and hemispheres.
Specialization has led to continuous, measurable improvements in consistency, quality, and speed to market. The isolation of steps has also allowed us to study discrete portions of the retouching process in order to improve efficiency and quality as well as automate over 50% of our editing through the use of artificial intelligence and other scripts.
The Journey of a Photo
The assembly line in practice
As stated earlier, at a traditional post-production company, one editor does all the editing on a particular image, following the instructions of the customer.
At Pixelz, individual Photoshop experts focus on “skills” or “steps” of the editing process.
One Possible Image Route
For example, a customer wants:
- The image cropped to a particular specification.
- Layer Mask to remove the background / create a transparent background.
- Retouch the skin of the model.
- Retouch the garments of the model, removing wrinkles, pins, and cleaning up any other imperfections.
This could be 4 separate steps performed by 4 different editors who have specialized in that particular technique.
Actually, only three editors may be required, because Pixelz has built AI technology to do the first step of cropping to specifications. And perhaps only two editors are needed, because an editor may specialize in more than one skill, so on a given day a Pixelz Photoshop worker might do 60 layer masks and 40 skin retouches.
And then there are automated and/or human QA gates after certain steps, allowing us to identify error and correct it before it wastes further work down the line (as well as engineer countermeasures to address the error’s introduction).
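The flow described above can be sketched in code. This is a minimal, hypothetical model of a QA-gated assembly line, not Pixelz’s actual S.A.W. internals: the step names, the `qa` flag, and the rework routing are illustrative assumptions.

```python
from dataclasses import dataclass, field

@dataclass
class Image:
    image_id: str
    completed: list = field(default_factory=list)
    rework_count: int = 0

def run_pipeline(image, steps, qa_check):
    """Run an image through ordered steps; a failed QA gate sends the
    image back to the failing step instead of wasting downstream work."""
    i = 0
    while i < len(steps):
        step = steps[i]
        step["edit"](image)                # specialist or AI performs the step
        if step.get("qa") and not qa_check(image, step["name"]):
            image.rework_count += 1        # log the defect for countermeasures
            continue                       # redo this step before moving on
        image.completed.append(step["name"])
        i += 1
    return image
```

The key design point is that a defect is caught and corrected at the gate where it was introduced, and the rework counter gives the data needed to engineer countermeasures later.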
Our intelligent virtual factory can even detect when certain assembly line steps need more workers and autonomously decides to reassign a worker from one step to the position our pipeline needs most. This autonomous decision-making is happening every second. Pixelz developers have built internal software for monitoring the state of our pipeline and steps, and a small team of planning-engineers are monitoring 24/7, using tools to manually optimize the pipeline. For example, in some situations the planning engineers may decide to override a virtual factory distribution of resources, or give more/less to a particular step for fine-tune optimizations. The combination of human talent and intelligent virtual factory autonomy is a constant pattern we see in Pixelz for achieving industry success.
This has huge practical implications for our ability to mitigate risk should offices go offline due to disaster or another factor. For example, we were able to maintain full operational capacity during the COVID-19 pandemic despite national and local lockdowns.
When an image is routed to a specialist, they are given directions in a visual, easily digestible format directly in Photoshop. Their next image is loaded in the background so there’s no lag time proceeding to it when the current image is finished, uploaded in the background, and automatically sent on to its next step.
And of course every action is logged, so we can measure and identify opportunities for improvement.
Those logs are even made available to the client in near real time, as well as the step an image is currently on.
"Because our CSR offers a lot of value to employees and the way we treat employees is becoming more and more well known, we are able to continue to recruit some of the best talent in the cities where we are located.
We need that special talent in order to operate the way that we do: we rely on the professionalism and expertise of our managers and our team leaders and staff, and the only way to get that is to make sure that they feel valued and comfortable and are proud of coming to work."
Terabytes of data
Those logs are connected to a database via the Photoshop API, and that database is massive.
"With several terabytes of data accruing every year—Pixelz edited over 6 million images in 2019—data is becoming the new currency at Pixelz."
It offers us unlimited opportunities for automation, optimization, growth, and continued evolution.
And that’s the whole point, right?
We’re not collecting data for its own sake. We’re collecting it in order to take improvement actions.
So an entire ecosystem of Pixelz software has developed around that database, from S.A.W.™ to client-facing tools.
Let’s take a quick look at some examples.
Jan 2018 - April 2020
A Highly-Evolved Post-Production System
Expected timings & image classification
Efficiency in image editing at Pixelz is measured in time.
In order to identify improvement opportunities, measure the success of implementations, and route images intelligently respective to deadline, we need to know how long it takes to perform an editing step.
Accuracy poses some obvious challenges. Some images are complex, others quite simple, even when coming from the same customer. The same step can have wildly different levels of complexity, depending on the product and the shot. A wicker chair shot on a textured background, for example, will be far more difficult to mask than a solid wooden chair contrasted against a studio-quality white background.
Pixelz tackles the issue of accurate timing estimations via "image classification."
In short, all incoming images first go through classification steps before proceeding to editing. We grade each image on six different dimensions of complexity, such as whether there’s a model and, if so, how much skin is showing, along with several other proprietary classifications. One of our most useful algorithms for image classification counts the number of “points and corners” that exist on the edges of the foreground object. For example, an image of a model may have 35 anchor points around its edge, while a wicker chair can have thousands of points, indicating the chair is of higher complexity and should therefore take more time to edit.
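One simple way to approximate such a “points and corners” score, sketched below purely for illustration (the proprietary algorithm is not public), is to walk a closed foreground contour and count the vertices where the outline turns by more than a threshold angle.

```python
import math

def corner_count(contour, min_turn_deg=15.0):
    """Count significant corners on a closed contour of (x, y) points:
    a vertex counts when the outline direction turns sharply there."""
    n = len(contour)
    corners = 0
    for i in range(n):
        ax, ay = contour[i - 1]            # previous vertex (wraps around)
        bx, by = contour[i]                # current vertex
        cx, cy = contour[(i + 1) % n]      # next vertex (wraps around)
        # angle between incoming and outgoing edge directions at vertex b
        in_dir = math.atan2(by - ay, bx - ax)
        out_dir = math.atan2(cy - by, cx - bx)
        turn = abs(math.degrees(out_dir - in_dir)) % 360
        turn = min(turn, 360 - turn)       # take the smaller turn angle
        if turn >= min_turn_deg:
            corners += 1
    return corners
```

A smooth silhouette yields few corners; a wicker texture yields thousands, which is exactly the complexity signal described above.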
All these classification efforts pay off when we have millions of images and their data in our database. A data scientist can analyze it and determine extremely accurate "expected timings" based on complexity score, AI mask quality, and our other classification markers.
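A minimal sketch of that analysis step follows. The field names and complexity buckets are assumptions for illustration; using the median keeps the estimate robust against a few unusually slow or fast edits.

```python
from collections import defaultdict
from statistics import median

def expected_timings(log_rows):
    """log_rows: iterable of (step, complexity_class, seconds_spent).
    Returns {(step, complexity_class): median seconds}, i.e. an
    'expected timing' per step and complexity bucket."""
    buckets = defaultdict(list)
    for step, complexity, seconds in log_rows:
        buckets[(step, complexity)].append(seconds)
    return {key: median(times) for key, times in buckets.items()}
```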
Footwear Efficiency - Vertical Bar 2019
We use these expected timings in several ways.
- Editor expectations - We show the expected timing to the editor, so for example they know we don't expect 5 minutes of skin retouch, but only 40 seconds of skin retouch—note that this may be a marker of retouching style, not quality. For example, a more natural aesthetic correlates to less skin retouching time.
- Staffing - We use expected timings in our staff planning and forecasting. That means determining how many of each specialist we need to be online each day in order to meet deadlines.
- Invoicing - We use expected timings when quoting Enterprise customers, and also for intra-company accounting.
- KPIs - We use our data to create new metrics and KPIs, one of which we call "Efficiency Score." It’s the Actual Working Time divided by the Expected Working Time. Divergences alert us to problem areas.
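The Efficiency Score formula from the list above is a one-liner; the sketch below just makes the ratio and its interpretation explicit.

```python
def efficiency_score(actual_seconds, expected_seconds):
    """Efficiency Score = Actual Working Time / Expected Working Time.
    1.0 is on target; values well above 1.0 flag a problem area, and
    values well below may flag a mis-calibrated expectation."""
    if expected_seconds <= 0:
        raise ValueError("expected time must be positive")
    return actual_seconds / expected_seconds
```

For example, an editor spending 50 seconds on a step expected to take 40 scores 1.25, a divergence worth investigating at scale.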
All of the above are valuable. Just to further demonstrate, consider that we can use the Efficiency Score metric to:
- Measure and analyze operational excellence
- Compare ROI of projects based on their potential to impact Efficiency Score
- Identify workers who need help to improve their efficiency
- Establish new management KPIs
- Create AI models
Let’s take a look at some other practical examples of the data-related benefits we’ve experienced from instituting our assembly line.
Automation & AI
We’ll go deep on this in Chapter Two, but automation and AI development is probably the greatest single benefit of using an assembly line for retouching.
AI is consistent, fast, and cost-effective. However, it is not as “reliable” as humans.
That means if you give a human 1000 different images and ask them to mask each one 10 times, they’ll be able to do it. But all the masks will be a little bit different—including when masking the same image over and over again.
If you give the same task to an AI, when masking the same image it will draw the exact same mask each time, and it will mask all of them much faster and at lower cost than a retoucher. However, there’s a much higher probability it misses on one of the images and cuts off an arm or a leg or something no human retoucher would do.
For AI to be successful, you need to have a clear and simple task definition, be easily able to generate datasets for training, and keep humans in the feedback loop.
S.A.W.™ is perfect for AI development. Because steps are isolated, an AI can be trained on a very specific task, like layer masking, and the input will be predictable (we’ll know it’s a product image, and quite often the product).
We have massive, real datasets that our developers can pull for training. Literally millions of images that exactly match what the AI will encounter in production.
And finally, we can use humans to guide the AI before it runs and to judge its quality on completion. For example, we can present the editor at the next step with the AI drawn mask and allow them to accept it, edit it, or reject it entirely.
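That accept/edit/reject gate can be modeled as below. This is a hypothetical sketch of the human-in-the-loop step; the verdict names and return fields are illustrative, not S.A.W.’s actual interface.

```python
from enum import Enum

class Verdict(Enum):
    ACCEPT = "accept"   # AI mask is used as-is
    EDIT = "edit"       # AI mask is kept but touched up by the editor
    REJECT = "reject"   # AI mask is discarded; image is masked manually

def review_ai_mask(ai_mask, judge):
    """judge(ai_mask) returns a Verdict; route the image accordingly.
    Only rejections discard the AI's work and fall back to manual masking."""
    verdict = judge(ai_mask)
    if verdict is Verdict.REJECT:
        return {"mask_source": "manual", "kept_ai_work": False}
    return {"mask_source": "ai", "kept_ai_work": True,
            "touched_up": verdict is Verdict.EDIT}
```

Logging these verdicts also produces exactly the labeled feedback data needed to retrain the mask model.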
And all of that can be done incrementally, with AI taking over one step at a time without impacting any of the other steps of the assembly line.
Surfacing best practices
In addition to efficiencies gained from automation, management and training teams can work with Business Intelligence to focus on improving individual steps in the process.
For example, a departmental goal for Q3 might be to improve our Retouch Footwear step.
Retouching Step Improvement
On this step, the editor is not thinking about cropping an image to particular specifications or cutting out the layer mask or working with colors and contrasts and shadows; the editor is purely focused on the footwear itself through the lens of retouching, blemish removing, dust removal, laces fixing, flipping, etc.
One proven approach to improving our Retouch Footwear step is to use the data to identify “master” editors.
So for example, our data team may look at the last 100,000 footwear images to come through the system and identify the top 5% of editors who have consistently had higher quality and faster timings. The data team then compares their techniques to the rest of the team, and quite frequently discovers significant differences—perhaps in tools used, time spent with specific tools, keyboard shortcuts, or even something like zoom level on an image while editing.
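The ranking itself can be sketched as follows. The record fields, the quality-then-speed ordering, and the 5% cutoff are illustrative assumptions standing in for the data team’s actual criteria.

```python
def top_editors(records, top_fraction=0.05):
    """records: dicts with 'editor', 'seconds', 'qa_passed'.
    Rank editors by QA pass rate, break ties by average speed,
    and return the top slice as candidate 'masters'."""
    stats = {}
    for r in records:
        s = stats.setdefault(r["editor"], {"n": 0, "passed": 0, "seconds": 0})
        s["n"] += 1
        s["passed"] += int(r["qa_passed"])
        s["seconds"] += r["seconds"]
    ranked = sorted(
        stats.items(),
        key=lambda kv: (-kv[1]["passed"] / kv[1]["n"],    # higher pass rate first
                        kv[1]["seconds"] / kv[1]["n"]),   # then faster average time
    )
    k = max(1, round(len(ranked) * top_fraction))
    return [editor for editor, _ in ranked[:k]]
```

The interesting work starts after this step: comparing the shortlisted editors’ tool usage and shortcuts against everyone else’s to surface teachable techniques.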
Those learnings are then used to create new training courses and management KPIs around the techniques. That training is able to be rolled out virtually to editors around the world who work on that specific step, and completing it may be incentivized or required.
We have an entire Academy team focused on building training courses: for onboarding new editors, opening up new skills to current editors, and retraining editors as we surface best practices or as changes are rolled out to Photoshop and our internal systems. Specialization allows for extremely fast onboarding of new editors, since they don’t need to master every technique before becoming productive.
Saving 10 seconds on an image may not sound like much, but when scaled up to millions of images it makes an incredible difference to our bottom line.
Pixelz staff in one of our cafe areas in Da Nang
Additionally, to drive improvement, we can create data-based bonus incentive programs to reward editors who make particular improvements to their Photoshop techniques and performance.
We absolutely consider our system to be gamified, and we have the data to prove how each new addition to our gamification system has created significant positive impacts on our Production, from performance metrics to employee motivation and enthusiasm.
Retaining talent is a top priority.
The data can reveal skill mismatch. Some editors may be extremely talented at Retouching but not necessarily at Vectorized Pathing.
Specialization allows editors to find their niche and excel.
That has both personal and professional ramifications. Who isn’t happier doing work they’re good at, and to know they’re contributing?
Editors also want to know they’re being evaluated in a fair and transparent way. If KPIs feel arbitrary, or unrealistic, it’s demotivating.
And in a system with achievable gamified bonuses, there’s financial incentive to spend your time on steps you excel at.
A central tenet of Lean is employee empowerment.
You might be thinking, “Doesn’t that get monotonous?”
Instead of 100% specialization on a single task, employees specialize in several types of tasks, which leads to greater system flexibility, better job planning, and efficient usage of human resources.
For example, if there is a bottleneck in the assembly line, or simply no images sent to a particular segment of the assembly line, the employees specialized in that task will not wait idly—instead, the S.A.W.™ system is intelligent enough to detect the idle segment and provide those workers with image work on a different assembly line task (perhaps re-assigning those workers to help clear the bottleneck, or fulfill a high-priority expedited order).
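The idle-detection idea above can be reduced to a toy scheduler. This is a deliberately simplified sketch under assumed data shapes (queue depths and per-worker skill lists); the real system weighs deadlines, priorities, and expected timings as well.

```python
def reassign_idle(queues, workers):
    """queues: {step: pending image count}; workers: {name: [skills]}.
    Assign each worker the skill whose queue is currently deepest,
    so nobody waits idly at an empty step."""
    assignment = {}
    load = dict(queues)
    for name, skills in workers.items():
        # pick this worker's skill with the most pending images right now
        best = max(skills, key=lambda s: load.get(s, 0))
        assignment[name] = best
        if load.get(best, 0) > 0:
            load[best] -= 1   # one fewer image waiting at that step
    return assignment
```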
This creates the best of both worlds: specialized employees who still enjoy skill variety, with virtually zero idle time of human resources. Furthermore, this system design creates higher job satisfaction by allowing skill variety, one of the core features of job satisfaction in the Job Characteristics Model proposed by Hackman and Oldham (1976).
Our specialists are one of our greatest strengths, and we’ve benefited from excellent retention over the years. You can have the best system in the world, but without the right people it’s not going to work. So it’s extremely important for our continued success that photo editors and all other team members thrive.
What does S.A.W.™ mean to our clients?
They don’t necessarily know the name or what’s happening on the backend, and honestly they may not need to care. What they care about is the impact it has on their images and their systems.
And those are unprecedented speed, consistency, cost, and quality. At scale.
The virtual assembly line lowers costs, increases quality, and accelerates time to market.
It’s also plugged into an intuitive frontend UI that simplifies communication, provides near real-time status updates, and gives clients more control over retouching than they’ve ever had before.
The Customer Perspective
To summarize, Pixelz has evolved post-production to a data-driven system with an AI-powered virtual assembly line. Retouching is broken down into component steps that are handled by either specialist editors or automated processes.
- Isolated tasks can be measured and improved
- “Expected Timings” and bonus systems for gamification and employee engagement
- New KPIs and data insights for Production Management to oversee
- Rapid onboarding of new editors
- Surfacing “masters” and best practice techniques which can be rolled out to the rest of the organization
- Retention of knowledge within the system
- Gradual introduction of AI technology and automation
- 99.8% success rate on quality
- 99.7% on-time delivery
- 3 hour delivery SLA for several Enterprise clients
- Full operational capacity during COVID-19
- Transparent real-time image status
- Lowest retouching TCO
Preview of Chapter 2: AI Layer Mask
In Q1 of 2020, 55.96% of Pixelz images received Layer Masking services. Layer Masking is more time-consuming, by an order of magnitude, than other assembly line steps, consuming approximately 50% of the hours spent in the entire pipeline.
That’s why we’ve been gradually introducing AI into the process for 4 years. Here’s how we’ve done it, where we’re at now, and where we’re going.