This article has been updated for 2025, incorporating exciting new advancements that show how far Pixelz has come, plus a glimpse of where it’s headed next!

Artificial intelligence isn’t just for the tech giants anymore; it’s everywhere. Today, anyone can tap into AI tools to automate tasks, boost creativity, and speed up production. That accessibility has sparked a wave of innovation, with new companies emerging every day to explore what’s possible.

We’ve actually been leveraging AI for over a decade, developing our own in-house tools to perfect everything from shadows and clipping paths to image classification (“is it a shoe?”), skin retouching, color matching, and ghost mannequins.

Today, that foundation continues to evolve. We now use AI to power Digital Twins, automate product tagging and descriptions, and even create AI models, all with the same goal: helping brands produce beautiful, scalable, and consistent visuals faster than ever.

For years, editing, retouching, and post-production have required meticulous, repetitive work that often left little room for experimentation or deviation from what was standard. Now, AI is rewriting that equation, and you should be taking advantage of it.

AI is practically built for e-commerce image editing (thank you, Google, Microsoft, and Adobe)

If you’re not already using AI to work smarter, you’re falling behind. Automation is one of the greatest efficiencies to be found in any workflow, and that’s especially true in expensive yet repetitive work like product photography, outsourced photo editing, and post-production.

In fact, today’s AI tools are nearly perfectly aligned with e-commerce image editing needs. Why?

  • The best minds in tech are intensely focused on AI image analysis
  • Standard input + standard output = actionable exploits
  • You can use the cloud to autoscale AI

Before we awe (or bore) you, here’s some quick background on what Pixelz is already doing with AI. It’s not just a theory to us, or something to look at in the future. The development of AI is driving our company.

Over 50% automated…

That’s right! We’re over 50% automated in our workflow with AI. We’re deep believers in standardization, automation, and quality prioritization. Why? Because lean production has proven time and time again that it produces better quality, faster, and more efficiently in virtually every industry.

That’s why we built S.A.W.™, a Photoshop assembly line for product image editing. S.A.W.™ breaks image editing down into small steps that are completed by a blend of specialist human editors, AI, and scripts.

So how does it work? How can AI be integrated into post-production?

First, you need something like S.A.W.™. Because we break image editing down into component steps, we’re able to isolate individual processes and train AI models on contained tasks. Controlling the input is critical: it limits exceptions and other “special” cases that are difficult for an AI to handle, just as they are for humans.

For example, in a product photo, the product is usually in the center of the image, and we also typically have an idea what kind of product a customer will be uploading. That makes it much easier to train an AI to draw a mask around the product. We can also use humans to identify edges before sending to the AI, and to validate results immediately after a step is completed—further training the AI.
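
To make that concrete, here’s a minimal sketch of the assembly-line idea in Python: each image moves through a template of small, isolated steps, and an AI step is always followed by a human validation step. The step names and the `ImageJob` structure are illustrative placeholders, not S.A.W.™’s actual internals.

```python
from dataclasses import dataclass, field
from typing import Callable

@dataclass
class ImageJob:
    path: str
    history: list = field(default_factory=list)

# Placeholder steps: in a real pipeline these would call an AI service,
# queue work for a specialist editor, or run a Photoshop script.
def ai_mask(job: ImageJob) -> None:
    job.history.append("ai_mask")

def human_validate(job: ImageJob) -> None:
    job.history.append("human_validate")

def remove_background(job: ImageJob) -> None:
    job.history.append("remove_background")

# One "template" = an ordered list of contained steps for a known product type.
TEMPLATE: list[Callable[[ImageJob], None]] = [ai_mask, human_validate, remove_background]

def run(job: ImageJob) -> ImageJob:
    for step in TEMPLATE:
        step(job)
    return job

print(run(ImageJob("shoe_001.jpg")).history)
```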

Let’s start with a real-life example.

Using AI to remove moles (It works! But nobody wants it...)

Before and after of AI mole removal.

Images are edited in micro-steps by specialists and automation, with all Photoshop activity logged.

One of our earliest functional AIs was trained to remove moles during skin retouching. We used traditional algorithms to first detect skin, then another algorithm to detect “candidates” (potential moles to remove), and then used an AI to classify those candidates.

Mole candidates were determined based on color difference with the skin. Seems straightforward—and that’s why there are existing algorithms for it—but the problem we encountered was that loose hair triggered false positives. The job of the AI was to classify moles and not-moles, and to train it, we fed it 65,000 images of moles and non-moles, sorted manually (fun job!).

After the moles and not-moles were properly classified, we used a standard Photoshop script to remove them.
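
As a rough illustration of that pipeline (skin detection, then candidate detection by color difference, then classification), here’s a simplified sketch using OpenCV. The thresholds, the candidate filter, and the `is_mole` stub are illustrative placeholders, not our production model.

```python
import cv2
import numpy as np

def find_mole_candidates(bgr: np.ndarray) -> list[tuple[int, int, int, int]]:
    """Return bounding boxes of dark spots on detected skin (candidate moles)."""
    # 1) Crude skin detection in YCrCb space (placeholder thresholds).
    ycrcb = cv2.cvtColor(bgr, cv2.COLOR_BGR2YCrCb)
    skin = cv2.inRange(ycrcb, (0, 135, 85), (255, 180, 135))

    # 2) Candidates = regions noticeably darker than the surrounding skin.
    gray = cv2.cvtColor(bgr, cv2.COLOR_BGR2GRAY)
    dark = cv2.adaptiveThreshold(gray, 255, cv2.ADAPTIVE_THRESH_MEAN_C,
                                 cv2.THRESH_BINARY_INV, 31, 10)
    candidates = cv2.bitwise_and(dark, skin)

    contours, _ = cv2.findContours(candidates, cv2.RETR_EXTERNAL,
                                   cv2.CHAIN_APPROX_SIMPLE)
    return [cv2.boundingRect(c) for c in contours if 9 < cv2.contourArea(c) < 400]

def is_mole(crop: np.ndarray) -> bool:
    """Stand-in for the trained classifier (mole vs. not-mole, e.g. loose hair)."""
    raise NotImplementedError

# Crops classified as moles would then be handed to a removal step
# (in our case, a standard Photoshop script).
```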

The project worked! We successfully automated mole removal, but in the meantime trends changed, and nowadays most of our clients prefer a natural look, complete with moles. It was mostly a research project anyway, and its success propelled us further down the AI path. It also highlights something important to remember: AI doesn’t always mean a human is replaced. In this case, we developed an AI tool that a human photo editor uses to be more efficient, and while the AI’s work sometimes still needs to be tweaked for perfection, the net result is a gain in efficiency.

Using AI as traffic control (Toothbrushes to the left, chairs to the right!)

In the Pixelz smart factory, the first AI process is focused on classifying images after upload: basically, looking at a photo and figuring out what’s in it. Much like in our “mole or not-mole” example, but with far broader parameters. Is there a model? A mannequin? Shoe? Bottle? Table? Etc.

Maybe that sounds simple to you (“My two-year-old can tell whether there’s a person in a photo or not!”), but it’s actually one of the biggest challenges in artificial intelligence. The human brain excels at interpreting visual and auditory input, and what children intuit without any apparent effort can stump supercomputers.

Translating street signs with your phone and a camera app? Image classification. Spotting warning signs of cancer in x-rays? Image classification. You get the idea.

So yes, Pixelz’ AI primarily classifies images (using the “Inception” architecture, best known from GoogLeNet, designed by, you guessed it, Google). The first time it does so is during a stage we call “image preparation.”
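
For readers curious what such a classifier looks like in code, here’s a minimal sketch using a pretrained GoogLeNet from torchvision as a stand-in; our production networks, training data, and label set are our own, and the file name is just a placeholder.

```python
import torch
from torchvision import models
from PIL import Image

# Pretrained GoogLeNet (the original "Inception" architecture) as a stand-in classifier.
weights = models.GoogLeNet_Weights.IMAGENET1K_V1
model = models.googlenet(weights=weights).eval()
preprocess = weights.transforms()

img = Image.open("upload.jpg").convert("RGB")   # placeholder file name
with torch.no_grad():
    logits = model(preprocess(img).unsqueeze(0))

top = logits.softmax(dim=1).topk(3)
labels = weights.meta["categories"]
print([(labels[i.item()], round(p.item(), 3))
       for p, i in zip(top.values[0], top.indices[0])])
```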

Image preparation is the first step all images go through, and it determines the future steps for each image. The bulk of it is categorization: as our COO Jakob Osterby says, “If you had the same input every time, it wouldn't be a problem because you know the object. But it could be a chair, it could be a toothbrush, or it could be a jacket. If it’s a jacket, there’s a big difference between leather and fur.”

Fortunately for us, we do have some expectations regarding the input because customers set up specifications for products ahead of time, but even so there’s still a lot of variance.

“If there’s a prop, if there’s not a prop, that’s a huge difference in our workflow,” says Jakob. “Maybe a template doesn't have a retouching step. But if there's a prop in front of the object—could be a hanger, a fishing line, or a bag stand—we're going to remove it.”

This adds complexity to our Image Preparation process, for which we have developed over a dozen in-house image classification outputs: Prop Detection, Model Detection, Skin Detection, Image Complexity, and more. For a bulk image editing service, understanding our images is essential.

The AI and human hybrid (Not a cyborg)

AI not only sorts images by type during image prep; it also assigns complexity scores based on things like facial recognition, background contrast, points in a layer mask, and the presence of skin.

That score helps to determine costs, timelines, and which AI or specialist editor an image is routed to.

“Having AI in classification, classifying something that helps another AI model algorithm perform something later on, I think that's a beautiful thing,” says Jakob.

One of the things the AI detects is contrast against the background. For example, a black leather jacket will have higher contrast against a white background than a white fur coat. The black leather jacket may be routed directly to an AI for automatic background removal, while the low-contrast white-on-white image is routed to an editor to draw a trimap, then masked by an AI, then sent to another editor to polish off the mask.
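
A toy version of that routing decision might look like the sketch below. The contrast metric, threshold, and file name are purely illustrative, not the scores our image preparation classifiers actually produce.

```python
import numpy as np
from PIL import Image

def edge_contrast(path: str, border: int = 10) -> float:
    """Rough proxy: how different is the image center from its border region?"""
    gray = np.asarray(Image.open(path).convert("L"), dtype=float)
    center = gray[border:-border, border:-border].mean()
    edges = np.concatenate([gray[:border].ravel(), gray[-border:].ravel(),
                            gray[:, :border].ravel(), gray[:, -border:].ravel()])
    return abs(center - edges.mean()) / 255.0

def route(path: str) -> str:
    score = edge_contrast(path)
    if score > 0.25:                      # illustrative threshold
        return "auto_mask"                # high contrast: straight to the masking AI
    return "human_trimap -> ai_mask -> human_polish"

print(route("white_fur_coat_on_white.jpg"))   # placeholder file name
```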

The data we record during the white-on-white background removal is stored and used to help train the AI later, with the goal of improving its ability to remove the background on low-contrast images. Update 2025: our AI now works very well on low contrast images!

Trimaps to find the way (Masking for the win)

That’s a pretty simple example, and there’s lots of software that can do a good job when images have sharp edges and high contrast (including Photoshop itself). Where it gets more difficult to draw a mask is when colors are subtle, products have lots of edges (like a mesh chair back, or jewelry chain), and when models are involved.

Human-drawn trimaps assist AI-drawn layer masks. Trimaps tell the AI where the edges are, and humans can make these trimaps in a few seconds. However, in 2020, we built a proprietary AI to make the trimaps, and a human Quality Assurance editor checks the work to make sure it is perfect (or fixes it when necessary).

Are you ready for some shocking news? Most models have hair. Lots of it, artfully styled and dramatically tossed.

Talk to any product image retoucher, and they’ll tell you that masking around hair is one of the most time-consuming parts of their job. Fine strands flying everywhere, crossing lines and adding tons of soft new points to draw around.

“I think the big push in machine learning is going to be a hybrid model between AI and human,” Jakob continues.

“For masking, the idea is that when the image comes in, we have somebody—right now (2020), an editor, but we’re training an AI—to draw a rough path around it. It takes two seconds in the interface, and it helps the algorithm detect the edge and distinguish between props and the actual product. Then we push it server-side, AI removes the background, and it’s sent back to an editor that validates and maybe refines it. Over time, the AI should be able to do more and more.”

Drawing a rough path around an image is part of generating a “trimap.” A trimap in our system breaks an image into three segments: the foreground (keep it!), background (delete it!), and the border area outlining where the mask will be drawn.
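
To illustrate the idea (this is not our production tooling), here’s a sketch that turns a rough binary mask into a trimap: erode for definite foreground, dilate for definite background, and label the band in between as the unknown region for the AI to resolve.

```python
import cv2
import numpy as np

def make_trimap(rough_mask: np.ndarray, band: int = 15) -> np.ndarray:
    """rough_mask: uint8, 255 = product, 0 = background.
    Returns 255 = keep, 0 = delete, 128 = uncertain border to be resolved by the AI."""
    kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (band, band))
    sure_fg = cv2.erode(rough_mask, kernel)      # shrink: definitely foreground
    maybe_fg = cv2.dilate(rough_mask, kernel)    # grow: anything outside is background
    trimap = np.zeros_like(rough_mask)
    trimap[maybe_fg > 0] = 128                   # uncertain band around the edge
    trimap[sure_fg > 0] = 255                    # definite foreground
    return trimap
```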

Gains, gains, and more gains (or how to mask a million images in 30 seconds)

As AI-driven workflows become standard across e-commerce studios, the efficiency gains are becoming super clear. In the case of AI masking, we’re already seeing:

  • 15x faster masking: There’s a lot of variance, since masking time is product-specific. A human might take anywhere from 20 seconds to 30 minutes to mask a product, while AI ranges from near-instantaneous to 30 seconds (or longer for both humans and AI on something like a bicycle). At present, most AI masks need additional adjusting afterward, but the human finishing off the mask has a huge head start! Update 2025: Now 60% of our AI-made masks are ready with no additional work, and even the remaining 40% still save 50% of the usual editing time. Which can lead to…
  • Savings on production costs: We spent a lot on research and development, but now that AI masking is up and running, we’re seeing significant ROI. As an added bonus, our editors have more time to become proficient at more advanced retouching tasks, adding more value for our clients.
  • Infinite scale: Our neural networks run in the cloud. Additionally, a larger image volume means a more accurate AI, as each edited image helps train it.

On that last point (and this is where you really begin to see the power of AI), we’ve made our masking AI autoscale on Amazon servers. As Pixelz CTO Janus Matthesen puts it, “We automatically spin up new servers as we get more images and scale down again when we are done. That means we can decide how fast we want our images to move through the step by simply spinning up more servers. The AI spends about 30 seconds per image, so in theory (with enough servers), we can complete any AI Mask workload in 30 seconds.”

That’s right. Whether 10 images, 10,000, or 1,000,000, they can all be masked in 30 seconds. And as of 2025, scratch the 30 seconds; we can now do it in 12 seconds, with better precision and higher quality. However, there is a tradeoff between speed and cost, and we don’t always need our AI to work that fast, so we control our speed to meet our production demands.
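
The back-of-the-envelope math behind that claim is simple. Here’s a sketch, with illustrative numbers, of how many parallel workers a workload needs to hit a target turnaround, assuming one image per worker at a time and negligible startup overhead:

```python
import math

def workers_needed(images: int, seconds_per_image: float, target_seconds: float) -> int:
    """How many parallel workers to finish `images` within `target_seconds`."""
    images_per_worker = max(1, math.floor(target_seconds / seconds_per_image))
    return math.ceil(images / images_per_worker)

# 1,000,000 images at ~12 s each, finished within 12 s: one worker per image.
print(workers_needed(1_000_000, 12, 12))      # -> 1000000
# Relax the deadline to 10 minutes and the fleet shrinks dramatically,
# which is exactly the speed/cost dial mentioned above.
print(workers_needed(1_000_000, 12, 600))     # -> 20000
```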

Goodbye, bottlenecks.

AI, Color, and the Art of Getting It Right 2025

When it comes to automation at Pixelz, rest assured, we haven’t been resting on our laurels. As Dr. Sébastien Eskenazi, R&D Director at Pixelz, puts it:

“Since our last automation update, we’ve been busy building. We now have around 30 automations for image retouching and related services like QA and rejection analysis. Most are AI-based, and the few that aren’t still rely on some pretty insane computer vision algorithms.”

Among those new developments, one innovation stands out: our AI for color matching and color changing.

“It stands out for two reasons,” Sébastien explains. “First, our customers really need it. And second, it’s groundbreaking on multiple levels.”

Note: All retouched images in this section are fully generated by AI. The only human input was a single click per image to indicate which color to change (which, as Sébastien notes, we didn’t bother automating).

Why color matching matters: “Who likes returns?” Sébastien says. “Color matching ensures customers see the product exactly as they’ll receive it — reducing returns and improving trust.”

Example of Color Matching

Why color changing matters: “How many times do you really want to shoot the same product just because the color changes?” Sébastien asks. “Color changing saves time — shoot one color variation, and let AI generate the rest. It also ensures perfect alignment across images, which looks great on product pages when switching between colors.”

Example of Color Changing

And why is it groundbreaking? Color, as it turns out, is no easy science.

“Any color expert will tell you that color is complex,” Sébastien says. “Between color spaces like RGB, Lab, and YUV, and profiles like sRGB, AdobeRGB, and DCI-P3, plus factors like lighting and shadows, it’s a challenge for humans, let alone AI.”
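
As a tiny illustration of that complexity, the same pixel has very different coordinates depending on the color space you describe it in. Here’s a sketch using scikit-image; our own color pipeline is considerably more involved.

```python
import numpy as np
from skimage import color

# One sRGB pixel (a saturated red), as a 1x1 "image" in [0, 1].
rgb = np.array([[[0.90, 0.12, 0.15]]])

print("sRGB:", rgb[0, 0])                  # device-oriented, non-linear
print("Lab :", color.rgb2lab(rgb)[0, 0])   # perceptual: L* lightness, a*/b* chroma
print("YUV :", color.rgb2yuv(rgb)[0, 0])   # luma + chroma, used in video pipelines

# Shift lightness in Lab and convert back: even a "simple" edit spans two spaces.
lab = color.rgb2lab(rgb)
lab[..., 0] = np.clip(lab[..., 0] + 20, 0, 100)
print("Lighter, back in sRGB:", color.lab2rgb(lab)[0, 0])
```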

To tackle this, Sébastien and the team built a model capable of understanding both the target and source colors, using one image as a reference to recolor another.

“The AI can identify not just where to recolor, but how much to recolor for each pixel,” he explains. “That alone required a completely novel AI architecture.”

Example of the AI using one image as a reference to recolor another.
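
Here’s a deliberately simplified sketch of that “where and how much” idea: a soft per-pixel strength map controls how far each pixel is blended toward a target color, while per-pixel brightness is preserved so shading survives. In our actual system the strength map and the color transform are learned; here both are hypothetical placeholders.

```python
import numpy as np

def recolor(image: np.ndarray, strength: np.ndarray, target_rgb: np.ndarray) -> np.ndarray:
    """image: HxWx3 float in [0,1]; strength: HxW in [0,1] = 'how much' per pixel.
    Blends each pixel toward the target color, weighted by the strength map."""
    # Preserve per-pixel brightness so shading and shadows survive the recolor.
    luminance = image.mean(axis=-1, keepdims=True)
    recolored = luminance * target_rgb            # flat target color modulated by shading
    alpha = strength[..., None]                   # broadcast HxW -> HxWx1
    return (1 - alpha) * image + alpha * recolored

# Usage sketch: strength would come from the model ("where and how much"),
# target_rgb from the reference image.
img = np.random.rand(4, 4, 3)
mask = np.ones((4, 4)) * 0.8
print(recolor(img, mask, np.array([0.1, 0.4, 0.9])).shape)
```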

A new way to see color

Even so, the team didn’t stop there.

“Retouching neons or moving between very bright and dark colors is where most models fail,” Sébastien notes. “Most computer vision color models rely on color clipping, which leads to ugly artifacts. So, we built our own model for colors — one that’s so different that the simplest way to describe it is to say we see colors as circles.”

What AI sees

The result? “With this approach, we can even handle color retouching for editorial images,” Sébastien says. “No issues with highlights, shadows, or neons. No oversaturation or artifacts.”

And the best part? “We deliver it as a Photoshop adjustment layer,” he adds. “If the AI makes a small mistake, you don’t have to throw the result away. Just make a quick fix, and you’re good to go.”

Example of Editorial Color Changing using AI

Sébastien is the first to keep it real: “It’s still AI, so it doesn’t work perfectly every time,” he admits. “But it nails the result about half the time. Pretty good, isn’t it?”

Not all automation is intelligent (Scripts aren’t AI!)

Let’s do a quick rewind before we dive into some more technical aspects of AI.

First, let’s start with a basic fact: not all automated processes are intelligent. To be truly intelligent, a process must be capable of learning.

For example, we have many automated processes that are not intelligent. They’re bots, scripts that work on tasks like “Apply Path,” “Apply Mask,” “Auto Preparation,” “Auto Finalize,” and “Auto Stencil.” They’re sophisticated, but limited in scope by their author’s imagination. They’re not going to get better at their tasks without human intervention: for example, when a scripter sees we need new handling for a different product type and goes in to manually modify the script.

That type of unintelligent bot is where Pixelz began automation years ago.

“The origins of it?” Jakob says. “We had a step where we only needed to import a layer to a step. We had people sitting and doing that, just pushing a button and then waiting for a script to run, ten or fifteen seconds. It was not only time-consuming, but a hell of a boring job.”

AI is different. An artificial intelligence is capable of learning, primarily by guessing and learning from the results of trial and error.

How to scale with AI (Use the cloud, duh)

AI doesn’t become intelligent without education, and for that, you need heaps and heaps of data and a lot of computing power. Just ask our CTO, Janus Matthesen. “When we train our AI models, we need to train on millions of images—a quite time-consuming process,” says Janus. “To find the optimal configuration, we also need to tweak weights and hyperparameters to find the right combination. We have moved this work to the cloud, and we have been able to scale it and test multiple configurations at the same time.”

“We used to use local servers with many GPUs, but now that Amazon has released the P2 and P3 EC2 instances, we have moved this work to the cloud,” says Janus. Update 2025: we have now extended our infrastructure to Microsoft Azure in addition to AWS, and recently acquired a new on-premises GPU server with eight very powerful Nvidia RTX 3090 GPUs. “Bringing down the time we use to find the right configurations and scaling our AI model training is a competitive advantage for us.”
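
Conceptually, the sweep Janus describes looks something like this: define a grid of hyperparameters and fan the combinations out to whatever GPU capacity is available. The grid values and the `launch_training_job` function are placeholders, not our real training infrastructure.

```python
from itertools import product

grid = {
    "learning_rate": [1e-3, 3e-4, 1e-4],
    "batch_size": [32, 64],
    "backbone": ["efficientnet-b0", "efficientnet-b3"],
}

def launch_training_job(config: dict) -> None:
    """Placeholder: in practice this would submit a containerized training job
    to cloud or on-premises GPUs."""
    print("submitting", config)

keys = list(grid)
for values in product(*grid.values()):
    launch_training_job(dict(zip(keys, values)))  # 3 x 2 x 2 = 12 parallel jobs
```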

AI just keeps getting better and better

As we’ve stated before, Pixelz planned to automate 50% of post-production by the end of 2019. (Update 2025: We succeeded, and we’re not slowing down. We can now offer retouching in less than 3 hours, and for Flow customers in as little as 10 minutes!)

We could easily get there just by applying our existing AI approach to additional areas (more specific retouching jobs like the mole removal example, or auto-cropping, or selecting primary images for marketplaces, etc.), but in truth we’re anticipating more revolutionary advancements.

For example, a major foundation was laid by the publication of two October 2017 research papers. The AI we (and everyone else) use is built on convolutional neural networks (CNNs). CNNs are what people usually mean when they refer to “Deep Learning” or “Machine Learning.” (Update 2025: We are way beyond these ancient concepts from 2017 :D Right now our work focuses on improvements to EfficientNet and Unet3+ networks. We are also keeping an eye on developments in transformer networks applied to computer vision.)

I don’t want to get too deep in the weeds, but simplistically, CNNs do a lot of classifying and counting of objects without identifying their relationships in three dimensions. As a result, a major challenge for CNNs is recognizing similar objects in different poses (position, size, orientation).

What that means for product photography is that CNNs aren’t always great at recognizing products that have been rotated, have atypical zoom, or are photographed from atypical angles. Or, more commonly, products that don’t have a fixed shape, like necklaces, with their wide variety of styling, chains, and charms. We’re able to mitigate those challenges by controlling the input and training with lots and lots of images.
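
One common way to “train with lots and lots of images” is to augment the training set with the poses you expect to see. Here’s a sketch using torchvision transforms; the parameters are illustrative, not our actual training recipe.

```python
from torchvision import transforms

# Synthesize rotated, re-scaled, and shifted variants so the network
# sees products at poses it would otherwise struggle to recognize.
augment = transforms.Compose([
    transforms.RandomRotation(degrees=25),
    transforms.RandomResizedCrop(size=512, scale=(0.6, 1.0)),
    transforms.RandomHorizontalFlip(p=0.5),
    transforms.ColorJitter(brightness=0.1, contrast=0.1),
    transforms.ToTensor(),
])

# augmented = augment(pil_image)  # applied per sample during training
```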

Those two 2017 papers introduced Capsule Networks (CapsNet), which explicitly model object pose. Identifying pose in itself is huge and valuable, but a CapsNet also needs much less data for training. That’s encouraging for us, as it would allow us to adapt CapsNet AI to more areas more quickly, and hopefully an increased understanding of image content will lead to even greater precision.

The GenAI Part of the Equation 2025

While many people slowed down this summer, we definitely didn’t.

In July 2025, we launched Digital Twins, our newest AI solution that brings together real model partnerships and generative technology. It’s built to help brands create campaign-quality visuals faster, without compromising on consent, consistency, or creativity. It was a big milestone for us at Pixelz!

As for the rest of generative AI, it has been all the rage within e-commerce and other industries, and for good reason. From large language models (LLMs) that generate text, to image-based AI creating visuals, and even tools that produce video and sound, we’ve entered a new era of creativity powered by machines.

Today, as a whole, we’re at the stage of multimodal AI systems that can take in various kinds of inputs (text, images, sound, video) and generate multiple types of outputs in return.

These models don’t rely on a single AI but rather a combination of specialized ones working together.

Digging deeper, Sébastien walks us through how advances in input preprocessing and reinforcement learning are shaping the next generation of AI.

“New image preprocessing techniques, particularly tokenization schemes, have allowed image editing to reach new levels,” Sébastien explains. “These approaches make it possible to process larger images without losing too much detail. A great example is Qwen2-VL, which introduced variable-length image tokenization to handle high-resolution images. In simple terms, before, images had to be resized to fit within a fixed token limit. Now, that restriction is being lifted.”

He also points out another challenge: images are two-dimensional, but AI models process information as one-dimensional token sequences. So how can an AI truly ‘understand’ spatial organization? “That’s where RoPE positional encoding and its many variants come in,” Sébastien says. “They essentially provide metadata that helps the AI know where each token belongs within an image — and can even extend to 3D data like video.”
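
To make the tokenization point concrete, here’s a minimal sketch of turning an image into a variable-length sequence of patch tokens, each carrying the 2-D position metadata that RoPE-style encodings build on. It’s a toy illustration, not Qwen2-VL’s actual scheme.

```python
import torch

def patchify(image: torch.Tensor, patch: int = 14):
    """image: (C, H, W) with H and W divisible by `patch` for simplicity.
    Returns (tokens, positions): the token count grows with resolution
    instead of forcing every image down to a fixed size."""
    c, h, w = image.shape
    rows, cols = h // patch, w // patch
    tokens = (image.reshape(c, rows, patch, cols, patch)
                   .permute(1, 3, 0, 2, 4)                  # (rows, cols, C, patch, patch)
                   .reshape(rows * cols, c * patch * patch))
    grid = torch.meshgrid(torch.arange(rows), torch.arange(cols), indexing="ij")
    positions = torch.stack(grid, dim=-1).reshape(-1, 2)    # (row, col) per token
    return tokens, positions

small = patchify(torch.rand(3, 224, 224))[0]    # 256 tokens
large = patchify(torch.rand(3, 448, 672))[0]    # 1536 tokens for the larger image
print(small.shape, large.shape)
```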

While reinforcement learning may not be central to image retouching, it remains a fundamental training technique. As Sébastien notes, “It’s been around since the 1990s and was key to systems like AlphaGo. It teaches AI to learn a sequence of actions even when the results aren’t immediate.”

All three innovations (image tokenization, positional encoding, and reinforcement learning) are foundational to how AI operates today. And as Sébastien concludes, “They’ll continue to improve, shaping the AIs of tomorrow. How far they’ll go? Only time will tell.”

Understanding these broader advances is what makes frequent new innovations possible, and what drives us to keep exploring what’s next.

How to use AI in your retouching workflow

I hope you enjoyed learning about how Pixelz uses AI to retouch product images and where GenAI is going, and I hope you’re convinced of its value (and a little less scared of our future SkyNet overlords).

If you’re looking to incorporate AI into your own retouching workflow, or want to explore our Digital Twins, feel free to ask us questions! Comment, email, or hit us up on social media.

We like to hear about the challenges other people are encountering, and problem-solving is fun.

Of course, not everyone has the time and resources to dive into AI headfirst. We do, so you can always test drive our system first.

Thanks for reading!