And frankly, much of it is bullshit. Pure marketing.
Not everything autonomous is AI, and AI isn't always better. Writing a script to automate a process doesn't make it intelligent, artificial or otherwise.
Even where AI is possible, using neural networks isn't always the right approach. It's costly, needs to be applied narrowly, and is often less efficient than a less buzz-friendly but more effective method.
So why is "AI" such a powerful label?
Because properly applied, AI really is transformative. We know: we've seen incredible gains using AI to automate product image editing.
So when and how should AI be used in order to be effective? How are those decisions made? Who's making them?
To find out, I interviewed Sébastien Eskenazi, Pixelz' in-house R&D Computer Vision Scientist.
Sébastien's a bit of a renaissance man, and his background reflects it. Consider his education for a quick example: in France, he completed a Masters with a double major in Aeronautical Engineering and Mechanical Engineering, then went on to get a PhD in Information Technology.
Before Pixelz, Sébastien was a project manager for automotive software: making tools for self-driving cars, basically. Before that he was algorithmically comparing digital documents to uncover forgeries.
At Pixelz, he's been hard at work developing computer vision algorithms and neural networks to push the boundaries of retouching automation for tasks like drawing layer masks, smoothing garment shapes, and cleaning up photo backgrounds.
But enough from me. Let's hear from Sébastien himself about technology, AI, and why retouching is such a fascinating field for computer vision scientists.
Full Interview (Sébastien on AI in Retouching)
- Sébastien Eskenazi
- R&D Computer Vision Scientist
- PhD Computer Vision
La Rochelle Université
What's your tech background?
I'm a computer science engineer. I did a PhD in France for that, and after that I went to Vietnam to work for a big company called FPT Software. In the U.S. I guess it's the equivalent of AT&T, maybe something like that. I was working for a customer who was making electronic chips, systems on chips, for cars. I did a bit of work on computer vision, basically, and neural networks.
Did I see that you actually got a double degree in aeronautical engineering and mechanical engineering?
Yeah that's true too, that was a while ago. I finished high school and then I did three years of pure math and physics, and then I moved on. That's the standard course in France for engineers. And then I moved on to what's called an engineering school, equivalent to some university I guess. There I learnt a lot about aeronautical engineering, aerospace engineering, and mechanical engineering. I got two degrees like that, a master's and an engineering degree.
I learnt a lot about aeronautical engineering, aerospace engineering, and mechanical engineering. I got two degrees like that.... then I moved on to PhD.
So for a while you were interested in flying, and then cars, is that right?
What happened is I went looking for jobs outside of France, but any job related to aeronautics or space outside of France is defense-related, so you have to be a national to be allowed to work on it. So it's impossible to find jobs outside of France. Then I moved on to a PhD, kind of a career reboot, because I was already a bit knowledgeable about IT. The thesis itself was related to security, IT security. Some people can abuse digitization services. For instance, they submit fake documents, and when they're digitized no one really checks them. So finding a way to check these documents, that's what I worked on.
AUTONOMOUS VEHICLE MODULES
Did you actually work on the autonomous driving aspect with cars?
Not really. I worked on tools for people who would make autonomous cars.
Little modules that are abstracted out?
Exactly. Well, people make neural networks but they don't make neural networks out of nothing, they need a framework. So I was working on these frameworks basically, on these libraries that make neural networks. And I did neural networks so I was working on that. I also wrote a white paper on a proposed architecture for autonomous cars.
I was working on these frameworks basically, on these libraries that make neural networks... Every time you want to implement a functionality that is automated, you need to think about the big picture and all the aspects of it.
Every time you want to implement a functionality that is automated, you need to think about the big picture and all the aspects of it. When you're talking about autonomous cars, it's so complex, you really need to think about the framework and how you're gonna put it together. And it's really the same thing here at Pixelz actually, every time we automate something it's not just a piece of software on its own. You've got humans, you've got photo editors, you've got customers, you've got customer accounts, you've got lots of people involved. We need to make sure that the software we make works well with all these actors around it. So that's the challenge, you know?
How do you approach AI development at Pixelz?
I am in the research and development team. We kind of work on whichever topic we want, but we do have priorities based on which tasks require the most time from the photo editors, and which tasks we can automate, in order to optimize the company of course. This gives us priorities. We have skills ["skills" are specific tasks photo editors perform on our assembly line], and so for each skill we know how long it takes, and based on what the skill is about we know if we can automate it or not.
We start from there, and each one of us picks a topic and says, "Okay I'm gonna automate this one," or, "I'm gonna try to improve this one." So we start like that, and then each one of us takes care of the whole thing. Depending on how you approach the system, some will just focus on their algorithm. For me I prefer to focus on the whole process, because if I make a tool but it's not used by the photo editor, it's useless.
Then I'm in charge of actually looking at the whole process in which my tool will be involved. I discuss with several actors, like Janus [Pixelz CTO] and the platform team. I also sometimes go to check with the photo editors themselves, observe how they use the tool or not. Also with the data science team to check the tool performance. We really look at the whole picture.
Are your tools always neural networks?
No, definitely not. So you've got a functionality you want to implement. There are two issues with any functionality: one is how easily you can express it in a way that's understandable for a machine. If it's very difficult to express, then a neural network will be better suited than a predefined algorithm. But sometimes algorithms are more efficient than neural networks, because neural networks need a lot of training data and they can be biased by the data set that we use for training. Sometimes plain computer vision algorithms are better suited. They're still not very easy to implement, we're still talking really high tech stuff. Not many people could do that.
There are two issues with any functionality: one is, okay, 'can you express your functionality?' And the second is, 'how many corner cases are you gonna have?'
Especially when you do photo editing, you deal with such a variety of images that corner cases become the second issue. One is, okay, can you express your functionality? And the second is, how many corner cases are you gonna have? The number of these corner cases, these things you see rarely, also makes it difficult. Based on these two things, how easy it is to express and how many corner cases you have, you can decide which approach is best: should I use a neural network, or should I use a specific algorithm, or a mix of both? For instance, we can use an AI to detect some parts of the image, and then use computer vision algorithms, like some kind of very advanced filter (plain filtering is very basic), to process the regions that have been detected by the AI. So sometimes it's a mix of both.
Another thing we can look into is using computer vision to provide more elaborate and more useful input to the AI. You can do a combination of those also.
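To make the "mix of both" idea concrete, here is a minimal sketch of the detect-then-filter pattern Sébastien describes. It is an illustration only, not Pixelz code: `detect_region` is a hypothetical stand-in (a plain brightness threshold) for what would be a trained neural network, and the 3x3 box blur stands in for the "very advanced filter".

```python
def detect_region(image, threshold=128):
    """Hypothetical stand-in for a trained detector: returns a boolean mask.

    A real system would run a neural network here; a brightness
    threshold keeps the sketch self-contained.
    """
    return [[pixel >= threshold for pixel in row] for row in image]


def box_blur_at(image, y, x):
    """Mean of the 3x3 neighbourhood around (y, x), clipped at the borders."""
    h, w = len(image), len(image[0])
    values = [image[j][i]
              for j in range(max(0, y - 1), min(h, y + 2))
              for i in range(max(0, x - 1), min(w, x + 2))]
    return sum(values) / len(values)


def filter_detected_region(image, mask):
    """Apply the classical filter only where the detector fired."""
    return [[box_blur_at(image, y, x) if mask[y][x] else image[y][x]
             for x in range(len(image[0]))]
            for y in range(len(image))]


# Tiny grayscale image: a bright 2x2 patch on a dark background.
image = [
    [10, 10, 10, 10],
    [10, 200, 220, 10],
    [10, 210, 190, 10],
    [10, 10, 10, 10],
]
mask = detect_region(image)
result = filter_detected_region(image, mask)
```

Pixels outside the detected region pass through untouched, so the classical filter can be tuned freely without affecting the rest of the image.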
AI & COMPUTER VISION DEFINITIONS
What exactly do you mean by "computer vision?"
You know there is marketing and then there is marketing, it's not polite, but it's called bullshit. There is a lot of that unfortunately. Oh, okay, so we need to clear two things. One, AI is nowhere near intelligent. It's definitely not intelligent.
What is AI? It's you take a bag of grains of sand, and you throw it on the ground. At first what you get is a big mess. And then you take a training data set, and that training data set is like a small finger that's gonna arrange the grains of sand one by one. And if you do that long enough then at the end you end up with a pattern that's meaningful. This is AI, and this is what you train, and then with this pattern it means something.
AI is nowhere near intelligent. It's definitely not intelligent.... AI is just a bunch of parameters that we have optimized, and it's not clever at all.
So an AI is just a bunch of parameters that we have optimized, and it's not clever at all. It has no ability to learn after it's been trained, except in some cases that have been made specifically for that. It will not create anything by itself, although some networks have been kind of customized to do similar things to that. So it's not that clever.
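The "grains of sand" picture maps fairly directly onto how training works in the simplest case. The sketch below is a toy assumption of mine, not anything from Pixelz: it starts with two random parameters and lets each training example nudge them slightly (stochastic gradient descent on a straight-line model) until a meaningful pattern emerges.

```python
import random


def train(samples, steps=2000, lr=0.01, seed=0):
    """Fit y = w * x + b by nudging two parameters, one example at a time."""
    rng = random.Random(seed)
    # The "sand thrown on the ground": parameters start out random.
    w, b = rng.uniform(-1, 1), rng.uniform(-1, 1)
    for _ in range(steps):
        for x, y in samples:
            error = (w * x + b) - y
            # The "small finger": each example moves the parameters a little.
            w -= lr * error * x
            b -= lr * error
    return w, b


# Toy training set generated from the pattern y = 2x + 1.
samples = [(x, 2 * x + 1) for x in range(-3, 4)]
w, b = train(samples)
print(round(w, 3), round(b, 3))
```

After training, `w` and `b` are just two optimized numbers. Nothing in the loop keeps learning once `train` returns, which is the point Sébastien makes about a trained network.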
Then, we have many other algorithms that are also not clever but still very, very useful. We've got, what's called classifiers. And AI will take an image and tell you, "This is a dog," right? But before AI, we could also do that actually. We don't need a neural network to do that. It's just not as good.
So what we do is we take the image, and then we will apply let's say a set of filters. A filter like we're looking for patterns basically, certain patterns like lines, like circles, other stuff like that. So we apply these filters on it, and we get from the output of the filters what's called "features." Then we do some kind of mathematical processing on these, and in the end we can obtain a value that will tell us, "Oh, this is also a dog." So you don't really need a neural network to do that, it's just neural networks perform better.
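Here is a rough sketch of that pre-neural-network pipeline: apply filters, summarize their output as features, then turn the features into a decision. Everything in it is a simplifying assumption for illustration. The two edge kernels, the hand-picked weights, and the "vertical edge" versus "horizontal edge" labels replace the far richer filters and learned classifier a real dog detector would need.

```python
def convolve_valid(image, kernel):
    """Slide a small kernel over the image (no padding) and collect responses."""
    kh, kw = len(kernel), len(kernel[0])
    out = []
    for y in range(len(image) - kh + 1):
        row = []
        for x in range(len(image[0]) - kw + 1):
            row.append(sum(kernel[j][i] * image[y + j][x + i]
                           for j in range(kh) for i in range(kw)))
        out.append(row)
    return out


# Two pattern detectors: one responds to vertical edges, one to horizontal.
VERTICAL_EDGE = [[-1, 0, 1], [-1, 0, 1], [-1, 0, 1]]
HORIZONTAL_EDGE = [[-1, -1, -1], [0, 0, 0], [1, 1, 1]]


def extract_features(image):
    """One feature per filter: the mean absolute filter response."""
    features = []
    for kernel in (VERTICAL_EDGE, HORIZONTAL_EDGE):
        responses = [abs(v) for row in convolve_valid(image, kernel) for v in row]
        features.append(sum(responses) / len(responses))
    return features


def classify(features, weights=(1.0, -1.0), bias=0.0):
    """Linear score over the features; the weights are hand-picked for the sketch."""
    score = sum(w * f for w, f in zip(weights, features)) + bias
    return "vertical edge" if score > 0 else "horizontal edge"


vertical = [[0, 0, 1, 1, 1] for _ in range(5)]
horizontal = [[0] * 5, [0] * 5, [1] * 5, [1] * 5, [1] * 5]

print(extract_features(vertical), classify(extract_features(vertical)))
print(extract_features(horizontal), classify(extract_features(horizontal)))
```

In a classical system the weights would be learned by something like a support vector machine, but the filters themselves are designed by hand. A neural network learns the filters too, which is one reason it tends to perform better.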
"Computer vision" is image processing. It includes all these branches of algorithms that are not neural networks, as well as some neural networks. It's more generic. So it includes all kinds of filtering: it can be color filtering, it can be filtering like, you know, a blur filter? This kind of image processing is computer vision. In photography, you know that the lens introduces distortion. We can remove this distortion with computer vision.
For instance, when you have several images of the same object, we can make a 3-D model of it. You don't need AI for that, you need computer vision. Computer vision is all about this, basically. So we have many algorithms. For instance, it's capable of identifying patches of uniform color, which are called superpixels. This is computer vision, it's a whole set of algorithms, not just AI.
So AI is kind of just a catch-all for anything where you're using a neural network to do your filtering. And computer vision is the rest of it.
Yeah, it also includes AI you could say, but yeah. That's exactly all the rest of it. It's more than filtering, you can do classification, you can do segmentation. So classification is like you have an image, you say what's in it. Or you have a zone in an image and you say what's in it. Segmentation is cutting the image into pieces based on what they mean. For instance, you could try to segment the background and the foreground, or you could try to segment the skin, the blue shirt, and the pants, and the background. This is segmentation. You can do it with AI, and you can also do it with computer vision in some other ways.
So we have all these things.
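As a small illustration of segmentation done "in some other ways" than AI, the sketch below separates foreground from background with a plain threshold. It assumes a product shot on a near-white background. The function name, the tolerance value, and the sample image are all hypothetical, and real product photos need far more robust methods.

```python
def segment_foreground(image, background_level=255, tolerance=10):
    """Label each pixel 1 (foreground) or 0 (background).

    Assumes a grayscale image on a near-white backdrop: anything within
    `tolerance` of `background_level` is treated as background.
    """
    return [[0 if abs(pixel - background_level) <= tolerance else 1
             for pixel in row]
            for row in image]


# Dark product pixels in the middle of an almost-white frame.
image = [
    [255, 254, 250, 255],
    [253, 40, 60, 251],
    [255, 55, 35, 249],
    [252, 255, 248, 255],
]
labels = segment_foreground(image)
foreground_pixels = sum(sum(row) for row in labels)
print(labels)
```

A rule this simple is easy to express, but it falls over on the corner cases mentioned earlier (white garments, shadows, reflective products), which is where a trained network starts to earn its cost.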
WHY AI WORKS FOR PIXELZ
At Pixelz, you're not just writing modules, but actually developing applications that are gonna go straight from you to the editor?
Yeah, right, it's exactly like that. There are many, many things to enjoy at Pixelz. One is, yes, definitely, I'm making an application that's actually used by the end user, so that's really nice. What we develop is tested before going into production, but then it goes into production and photo editors are using it. Something really nice also is that at Pixelz we have the means to do what we need to do. For instance, to train a neural network you need a big GPU, you know, a big graphics card with whatever is needed. Some companies could be cheap on that, but at Pixelz we actually have the means to train the networks we need, no matter their size. And so that's really nice.
What's really nice also is because of S.A.W. [our Photoshop assembly line] we can get any training data set we want. And that is just so beautiful.
So AI is super expensive, for several reasons. One is, GPU computing power is expensive. The second thing is, if you want a good data set, that is super hard to get and usually super expensive. Because to get the training data set you need to collect tens of thousands of images, and you need to annotate these images properly. You have to produce the expected result manually, or semi-automatically if you want. But you need someone to check that these results are really what you expect.
At Pixelz, because we have S.A.W. and the photo editors are already doing it manually, well we already have our data sets done for us.... If you don't have this production line, there is no way you can compete with what we do at Pixelz. Literally.
So you have a good training data set, and this is very, very expensive. But for us at Pixelz, because we have S.A.W. and the photo editors are already doing it manually, well we already have our data sets done for us. We just need to branch into the production line and take out the images we want. I want a data set, I can get 5000 images every day, so that's very, very nice. Sometimes more.
I hadn't thought about the luxury of that.
Yeah well it is a real, real luxury. In photo editing, if you don't have this production line, there is no way you can compete with what we do at Pixelz. Literally. I know it's promoting Pixelz too much, but really this is one of our very, very strong points, because for AI you need a training data set, and a large one. We have the best tool to get one, so it's there. And data's really useful.
And the other thing I really enjoy a lot is we have an impact on the bottom line of the company. I mean, if we automate something let's say I get 10% off the time to make a skill, that means it costs us 10% less for this skill. That is really nice also. There are not many companies where your work has a direct impact on the company's bottom line. So that's really nice.
Any big things on the horizon?
Well, we are doing quite a few things right now. Tuan is working on the background retouch. So he's cleaning up background automatically, and that is really, really nice. All the dust you have in the background that is manual removal right now, he is expecting that for 80% of them we can do that automatically. That would be really nice.
Luyen is working on a classification task. The idea here is that by making a data set and training an AI for it we will be able to identify what is the content of the images in order to identify the best processing for it. That will be very useful for us, it's like an enabler for the other tools.
[AI Mask] is a lot more powerful when photo editors are able to do the trimap, check the output, correct the trimap as needed, almost in real time.
Hieu is working on making the AI mask work much faster. It can become a tool that is a lot more powerful when photo editors are able to do the trimap, check the output, correct the trimap as needed, almost in real time. So that will be much faster for them.
Tung is working on the cropping tools. Some customers request that we crop some of the images, crop the bottom under the knees or under the waist or some other identifying words; the product, the character, and so on. So he's working on automating that.
And right now I'm working on releasing the next version of the AI mask, which will be a big step up. Six months ago I identified some issues which I think already doubled the efficiency of our AI mask. With the new release I plan to double it again.
AI & PERSPECTIVE
If you look at AI in marketing, you'll see it frequently in CRM systems. People don't usually apply it to photo editing because one, they think it's super hard to do especially with the quality requirements that we have. And it's true, but the first problem they have is they don't even know what the quality requirements are.
One of the biggest problems for a neural network is that you need data sets, you need lots of things, but you also need what's called an "objective function." So you need to tell it what to do. That also requires measuring its performance. I mean, photo editing, how do you measure the performance of someone who does photo retouching? That's kind of tricky to do.
So it's not always possible, actually, but for AI mask for instance it is possible, because we have the mask and we know how far we are from it, either from a transparency point of view or in terms of edge distance. In that regard we have really good performance, which is nice.
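The two measurements mentioned here, transparency difference and edge distance, can be sketched as follows. This is one plausible reading, assumed for illustration: the exact metrics Pixelz uses are not described in the interview, and the function names and 5x5 test masks are hypothetical.

```python
import math


def alpha_error(predicted, reference):
    """Mean absolute difference between two alpha masks (values in 0..1)."""
    diffs = [abs(p - r)
             for prow, rrow in zip(predicted, reference)
             for p, r in zip(prow, rrow)]
    return sum(diffs) / len(diffs)


def edge_pixels(mask, level=0.5):
    """Foreground pixels with at least one background 4-neighbour.

    Positions outside the image count as background, so foreground
    pixels on the image border are treated as edge pixels.
    """
    h, w = len(mask), len(mask[0])
    edges = []
    for y in range(h):
        for x in range(w):
            if mask[y][x] < level:
                continue
            neighbours = [(y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)]
            if any(not (0 <= j < h and 0 <= i < w) or mask[j][i] < level
                   for j, i in neighbours):
                edges.append((y, x))
    return edges


def mean_edge_distance(predicted, reference):
    """Average distance from each predicted edge pixel to the nearest reference edge."""
    pred_edges = edge_pixels(predicted)
    ref_edges = edge_pixels(reference)
    if not pred_edges or not ref_edges:
        return math.inf
    return sum(min(math.dist(p, r) for r in ref_edges)
               for p in pred_edges) / len(pred_edges)


def block(top, left, size=3, shape=5):
    """Square foreground block on an otherwise empty mask."""
    return [[1.0 if top <= y < top + size and left <= x < left + size else 0.0
             for x in range(shape)]
            for y in range(shape)]


reference = block(1, 1)   # the editor's mask
predicted = block(1, 2)   # same shape, shifted one pixel to the right
print(alpha_error(predicted, reference), mean_edge_distance(predicted, reference))
```

Either number can serve as an objective function for training, and the same functions can compare two human editors' masks of the same image, which is how a metric built for AI ends up measuring consistency between people.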
Basically when you start using automation and AI, there is an automation part of it which is beneficial. There is a cost part of it which can be very detrimental, especially if that's not your main line of business. But there is also the fact that it offers another perspective on your work, and this is very interesting. The idea of measuring the quality of photo editing, or even the consistency of your photo editors. It probably never occurs to many people, because they just don't think or don't care. Actually they do care, they just don't know about it.
For instance, when someone makes a mask, if the mask goes out of the product a bit, or inside of the product a bit, it's not right basically. It is very difficult to quantify how right or not right it is. Then if you want to have statistics and to measure that, it's even more difficult.
We were able to do that for AI, and I think that brought us some very interesting statistics also on the performance of our photo editors. So that's kind of the useful things we do.
USING DATA SCIENCE TO DISCOVER AUTOMATION OPPORTUNITIES
Something else we do also at Pixelz that I'm a bit involved in, not really through the R&D team itself, but we have a Data Science team.
What they do is basically help us understand where we're at. To give you an example, we have the skills and we want to improve how much time it takes for them. Part of that is improving the automation but in other ways improving the training for the photo editors, identifying which strategies work better for the skills or not. And they help us do that.
For instance, they work in conjunction with an incredible expert on Photoshop. So the data science team they work with him to identify what are the best strategies for different skills. I worked with them on this. So our pricing is based on how long it takes to do a skill, of course, but inside each skill we have what's called operations. So we have the big pipeline which is broken down into skills, and the skills are broken down into operations. Some customers request some operations and some others not.
So the time taken for each operation actually has an impact on the cost. The board of directors came to the data science team, and I was in the discussion at the time, asking, "Okay, can we get the time for the operations?" And we don't have this logged accessibly, so we had to find a way to do that. And that's kind of the job of the data science team: find information.
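The pipeline, skill, and operation hierarchy, and the way per-operation time feeds into cost, can be sketched with a couple of small data classes. All names and timings below are invented for illustration. The interview does not describe Pixelz's actual data model or figures.

```python
from dataclasses import dataclass, field


@dataclass
class Operation:
    name: str
    seconds: float  # average editor time; hypothetical figure


@dataclass
class Skill:
    name: str
    operations: list = field(default_factory=list)


def skill_seconds(skill, requested):
    """Total time for only the operations a given customer requested."""
    return sum(op.seconds for op in skill.operations if op.name in requested)


# A pipeline is a list of skills; each skill is a list of operations.
pipeline = [
    Skill("masking", [Operation("draw mask", 90.0),
                      Operation("refine hair", 45.0)]),
    Skill("background", [Operation("remove dust", 30.0),
                         Operation("even out gradient", 20.0)]),
]

# Two customers requesting different subsets of operations.
customer_a = {"draw mask", "remove dust"}
customer_b = {"draw mask", "refine hair", "remove dust", "even out gradient"}

total_a = sum(skill_seconds(s, customer_a) for s in pipeline)
total_b = sum(skill_seconds(s, customer_b) for s in pipeline)
print(total_a, total_b)
```

Once timings exist at the operation level, the same structure shows which operations dominate a skill's cost, and therefore which ones are worth automating first.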
I think this approach of having AI development and data science all working together is really helpful for a company to improve the efficiency. It's really crazy how much efficiency we can improve like that.
Any final takeaways?
Actually, there are several things. People are scared of AI and automation because they think it's gonna take their job away. That's kind of true in the sense that what AI is doing a human will not be doing it. But that's also wrong in the sense that there are many things we don't have the time to do right now because we don't have the resources for it. As AI frees our time we can do more stuff that's more interesting.
For instance, AI mask is a rather boring task, and once it's automated our photo editors can focus on more advanced retouching, which is more interesting for them, and also more rewarding for us, for the company, and the customer. I think that's also a good side of AI and automation in general.
Yeah, I agree. I think it was true with most mechanical operations and I think it's true with ... we call it AI but it's the same thing, just a digital machine here.
You also see that many people think that it's impossible. Things are impossible because, for instance, making a nice picture is a creative action. Actually, how much of that action is actually creative? That's part of the questions we can ask ourselves also. And all the parts that are not really creative then we can automate them.
The creativity is done by the people who take the picture, the photographer, the filmmaker, but then for the retouching it's not so much creativity... Anything that's not creative can be automated.
That's part of why we were able to do this, is a lot of what we do is not so creative. The creativity is done by the people who take the picture, the photographer, the filmmaker, but then for the retouching it's not so much creativity. And even then the creativity comes in what the customer says, not what we do. We don't have a creativity license at Pixelz.
We're following directions.
Exactly, so in that regard anything that's not creative can be automated.