Companies With Large AI Projects Are Calling Alegion to Help Prepare Needed Data via Crowdsourcing
Nathaniel Gates is CEO of Alegion, which he co-founded in 2012. Prior to Alegion, he founded Cloud49, a cloud computing solutions provider focused on the public sector. He lived and worked in Alaska for 36 years before moving to Austin, Texas, in 2012. He recently spoke with AI Trends Editor John P. Desmond.
Q. What is the mission of Alegion?
A. The mission of Alegion is to get work done and to distribute work to a workforce across this country and around the world. We do that by providing a service, and that service is to bring exceptional human and machine intelligence to bear against very large business challenges, largely around transitioning to an AI data-driven environment. Our goal is to hold the hand of customers and take them over the bridge to AI. We meet them on the near side of it, which is human judgment and human intelligence, and we cross the bridge with them to the point where we can hand off to a machine intelligence, where they can do the work with confidence and quality. That’s really the goal of our engagements with our customers.
Q. Can you talk about your crowdsourcing approach?
A. Sure. When Alegion first got started about six years ago, we were helping organizations leverage public crowds to get work done, for example by using Mechanical Turk, an Amazon crowd. That’s a workforce made up of hundreds of thousands of people all over the world who have unique skills and abilities that can be brought to bear against different business challenges. A business might need to categorize receipts, tag photos, or moderate user content when it gets uploaded to a website. That work can be split up and spread out among these workforces to get it done at incredible scale and with elasticity. What I mean by scale is that it doesn’t scare us to address millions of records simultaneously, and we can do that by scaling up the workforce immediately to tackle a challenge. When the work starts to fade away, the workforce gets assigned to other work. We call that elasticity: we can go up and down based on demand.
And so crowdsourcing was Alegion’s business model for several years, up until about two years ago. Then we started looking at the preponderance of our customers who were using our crowd to develop training data for their AI projects. What I mean by training data is that the models organizations use for artificial intelligence oftentimes require a supervised data set. That is a data set that has been validated and is accurate, so that the model can make the right decision in the right circumstances. Just as we train our toddlers and tell them not to go next to the hot stove or tell them to clean their room, we train them and we set the boundaries, and they can lean on that training and that past performance when it comes time to make future judgments. How that relates to AI is this: if you’re developing a self-driving car and you want it to know that what it sees in front of it is a child chasing a ball, that has to be a trained response. The model has to have seen many instances in which a ball rolls in front of the car and somebody has labeled that as a hazard. Thus the machine is trained to know that it needs to make a judgment decision. So these organizations have been using crowdsourcing for some time, and at an increasing pace, to create very large training sets for the AI initiatives within their companies.
Q. You mentioned your business model changed about two years ago?
A. It really has. We’ve gone from being a services-based company using the crowd to work on projects, to being focused only on providing services related to artificial intelligence efforts. That allows us to develop our own software and our own processes to ensure that those training data sets are developed at very high quality and are very specific to the customer company and its industry. So we are more focused, and we have developed software and a platform to help us do the work with repeatable high quality. That was a pretty big change for our organization. Fundamentally, though, it didn’t change the business model. We are still distributing large amounts of work to workforces, and we at Alegion are ensuring that we give high-quality results back to the customer.
Q. How do you sell your services and what industries do you target?
A. Our business model is software as a service. Organizations subscribe to our platform, and our platform allows them to create training data. It could be data that they don’t have. They may want to go collect data or find data relevant to what they’re trying to do. If the company is building a recommendation engine for what videos customers should watch, they might want to find 100,000 types of movies and then classify those. In that case, we’d be collecting data and then classifying that data so that it could be ingested by their model.
Most companies already have their own data, but the data is not suitable for training because it has not been properly classified. So a large company like Coca-Cola might have huge amounts of records for bottling companies and distribution entities. But if they want to put together a predictive algorithm for where people are going to buy soda in the future and what other flavors would be congruent with past performance, they would need a company like Alegion to go through huge amounts of data that they already have and apply structure to it. That structure would then organize the data in a format that could be processed by the algorithms. That’s what we do in those cases.
The targeted industries are varied. I would say the targeted function that we’re going after is artificial intelligence initiatives within enterprises and larger organizations. The industry has a little less relevance. Some of our largest customers are financial services organizations or government contractors. We have large customers that are media and publishing companies. What bonds them all together is that they have high volumes of data that they’re trying to describe and structure. And crowdsourcing on our platform is a fantastic way to do that with an assurance of quality.
Q. How’s the company doing?
A. The company is doing great. We saw about a 10x increase in the 12 months of 2017, and about 80% of that was attributable to artificial intelligence. We’re seeing new companies with larger projects they want to do around the concept of AI training. So this growth wave we are in right now is pretty exciting, and we also know it’s very early. Everyone is still trying to figure out which way is up with artificial intelligence, how it could potentially disrupt their industry, and whether they could be the catalyst for that or whether they will get caught from behind. And so the vast majority of this market is yet to come.
Q. Can you describe the AI that you use a little bit?
A. Sure. So we interact with AI in two different spheres. First of all, the customer has their own AI that they’re pursuing. And that’s what they’re trying to train. We don’t have a great deal of visibility into that. So, it might be an insurance company that is building an application where you can take a picture of a car and it could assess the amount of damage to the vehicle. That artificial intelligence is the customer’s responsibility. All they need us to do is to have workers draw squares around the damage, and classify how much damage there is for the training purposes.
Alegion also uses AI and machine learning internally to improve our own processes and quality. So we can look at an image and use our own algorithms to determine whether the vehicle has damage. Do we think that photo is appropriate or not, before a human ever sees it? And then as that model gets better and better, we are less dependent upon even our own humans. So we’re improving models in order to drive high-quality training data for our customers. Our customers in turn take that training data and build fantastic new artificial intelligence that’s transformative in their market.
Q. Many companies today are trying to monetize their digital content. Are any companies you are working with doing this?
A. Yes. We work with publishing companies that have a tremendous volume of digital assets, such as magazine photography or newspaper columns and archives. They may want to get that data into syndication so it can be monetized. In order to do that effectively, there has to be the right metadata so that a person searching for that content can find it easily. We might have to tag a photo with different attributes, for instance. So a customer would come to us with a specific taxonomy and say, “Please go through our million photos of the last 10 years and apply this taxonomy to it.” And by taxonomy, I mean the structure of the data classification system. That way, later on, when their customer is looking for a specific photo, they can search “man with a green hat” and find all the photos of a man with a green hat that are available for syndication. That’s a way you can monetize an asset: by adding metadata to it so it’s searchable.
Another way to monetize an asset is to offer like or similar assets based on what the customer is browsing. We see this a lot in things like Netflix, where you’ve just watched a science fiction movie and, lo and behold, Netflix would like to see if you’d like to watch this other science fiction movie because it has a similar affinity. This depends upon metadata, which is data that describes the underlying data. It’s one thing to say, “I have a phone on my desk.” It’s another thing to say, “I have a black Polycom with the handset.” The more metadata that describes it, the more I can link it to other things that have similar affinities. An interesting point is that many customers will come to us for cost savings initiatives. They would like to use artificial intelligence or the crowd to help sell their product at a lower price or make their customer service less expensive.
We have customers with 10 years of customer service recordings that they want to turn into a chatbot, in order to reduce the number of customer service people they need by, say, 50%. That’s success to them because they’ve reduced their customer service cost, hopefully without sacrificing customer experience. But what comes out of it, if we do our job right and are able to classify the data and the conversations well enough, is a revenue-producing opportunity for the customer. The chatbot is able to lean back on the context and the associations that we have identified in the historical data to know that, in the context of this conversation, this customer could be very well suited to buy a certain product. And the chatbot can actually suggest that.
A human intelligence or a machine intelligence can, in exactly the same way, look at the totality of the data that trained it in the first place and find a correlation with someone with a similar footprint who was interested in certain services. So even though these customers are typically knocking on our door looking for cost savings, we tend to find them revenue-producing opportunities with artificial intelligence. And that’s what becomes transformative in the space.
Q. What would you say is Alegion’s differentiation in the market and who is the competition?
A. A small group of us are doing the majority of this work; we’re just at the tip of the iceberg right now. Our competition includes CrowdFlower, Mighty AI and CloudFactory. These are all companies that use crowdsourcing to build out fantastic training data for their customers. We would differentiate ourselves from these folks on a couple of different fronts. We take a service-first approach rather than just giving a customer a command console and saying, “Go make your training data.” We usually wrap that with a customer success team, and that customer success team owns the onus of quality for the customer. So we would give an SLA [service level agreement] to say, “We will return your data to you at this level of quality and we will do whatever services are required in addition to our software to get it there.” We find that customers are very interested in not having to learn another piece of software. They would love just to have their problem solved. So that’s the first differentiator you’ll see for us.
A second one is that we are the most suited to offer higher security solutions. Customers, and rightly so, are very nervous about who has access to their data, because sometimes their data in turn has customer data in it. So how can they give that data to a company like Alegion, which is going to add metadata to it, without putting the data at risk? Alegion has multiple stages of secure data approaches that meet the customer’s internal regulatory or governance requirements for security. So Alegion is the secure crowd approach. We can do a number of different things to ensure that the data never leaves the customer site, is never unencrypted, and is always audited, so that they are able to pass the different regulatory hurdles as necessary.
Q. How do you price your product or your service?
A. It comes in a couple of different flavors depending on the customer’s requirements. If the customer has a one-and-done project and they just want to train a model for that project, we just give them a price to do that. If the customer foresees multiple projects, then we would sell them a subscription to our software and charge an added fee for whatever Alegion services they use.
Q. Bias is a challenging issue for businesses that want to strategically deploy AI. How do you deal with bias from your perspective?
A. We see a lot of bias, and the challenge of bias, in our customers. They will have developed data sets, and those data sets often have leveraged some third party to add context. That could have been interns, or sales folks on their team, or maybe they sent it out to an international outsourcing company to add metadata on top of their data. What they don’t understand is that this comes with bias. When the people who are describing your data are siloed and too like-minded, they will bring their presuppositions and understandings to bear when they make a judgment. If those presuppositions are too similar, then you’re going to see a trend in the judgments that reflects them. And that’s where you see bias starting to affect your data.
So one of the fantastic fringe benefits of crowdsourcing is that we can bring context from multiple countries, from multiple venues, and from multiple strata of socio-economic status that will all look at a problem differently. For example, a customer that was building a fashion identification model was seeing too many photos end up with the models wearing boots. That just seemed really unusual. We went back and looked at the data that they had and found out that the outsourcing firm they had derived the data from was in India. And in that case, the workers called any high-heeled shoe a boot. No one knew that. And so we ended up with that bias of culture driving noise in their data. So guess what? Their AI algorithms started calling every high-heeled shoe a boot. When they handed it to us, we were able to provide descriptions from multiple different perspectives and multiple different countries and round out their data set substantially. So bringing in multiple perspectives is a fantastic mechanism for reducing bias in the data.
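As an editorial illustration of the point above, one common way to combine labels from several annotator pools is a simple majority vote. The sketch below is only that, a minimal sketch; it does not describe Alegion’s platform, and the function name `majority_label` and the example labels are hypothetical.

```python
from collections import Counter


def majority_label(votes):
    """Return (label, agreement) for a list of annotator labels.

    `agreement` is the fraction of annotators who chose the winning label.
    """
    if not votes:
        raise ValueError("at least one vote is required")
    counts = Counter(votes)
    label, count = counts.most_common(1)[0]
    return label, count / len(votes)


# Hypothetical labels for one photo of a high-heeled shoe.
# A single like-minded pool agrees with itself, so its shared habit goes unnoticed.
single_pool = ["boot", "boot", "boot"]
# Adding annotators from other pools exposes the disagreement and changes the result.
mixed_pools = ["boot", "boot", "high heel", "high heel", "high heel"]

print(majority_label(single_pool))
print(majority_label(mixed_pools))
```

A low agreement score is itself useful: it flags items where annotators from different backgrounds read the same image differently, which is where a taxonomy usually needs a clearer definition.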
Q. What can you say about your future plans?
A. It’s pretty exciting. We are an artificial intelligence services company. We’re similar to an oilfield services company. If they say data is the new oil, then we’re an oilfield services company for AI. We really feel like we’re positioned well regardless of what the actual ambitions of the companies wanting to utilize AI are going to be. We see the value that we’re offering across multiple industries, and we’re not getting siloed into specific use cases or specific buyer personas. It’s coming from everywhere.
Everyone has an AI project right now. Everyone is trying to figure out how they can monetize their data and how decision-making in their organization can be influenced by machine intelligence. So we’re really excited. That is the reason customers are calling us right now, and that in turn opens up opportunities for us to hire people. In this AI revolution, we are hiring people and providing jobs here in the US and beyond, in a world where everyone is concerned that the robots are going to take their job. We can’t see an end to the growth of us hiring more people to help build these data sets. Let me just describe the AI lifecycle for you; I think it might be interesting to you.
When a customer first approaches us, we see where they are in their maturity with artificial intelligence. We call that the AI lifecycle, and we describe it like a bridge to artificial intelligence. On one side of the bridge is your manual judgment and your manual processes. And on the far side of the bridge is machine judgment and machine automated processes. And that’s where you want to go. That’s disruptive in your industry and that’s transformative. The companies are on the bridge whether they know it or not. They’re either on the near side or they’re on the far side or they’re somewhere in between. And anywhere they are on the bridge, they can use Alegion, and they can use human intelligence to spur on growth.
So if they’re on the near shore of the bridge, they can use crowdsourcing and Alegion to go out and collect new data or to structure the data that they already have so that they can train a model. Once they have a model, they can use human intelligence and Alegion to validate that model, to score it, to continuously improve it, and to see where it needs to get better. And then once the model is doing really well, you start to hit this plateau where you ask, “What do I do with the ones the model can’t do? Maybe we’re fine if the model can do 85% of the judgment, but what do I do with the last 15%?” Well, that’s where humans are still there doing exception handling, and the judgments they make in that exception handling go back to improve the model. So regardless of where they are on that bridge in moving into AI, Alegion has help for them in the form of humans. That’s the irony of it, right?
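As an editorial illustration, the exception-handling loop described above can be sketched as confidence-based routing: the model keeps the predictions it is sure about, and the rest go to people whose answers become new training examples. This is a minimal sketch under stated assumptions, not Alegion’s implementation; the `Prediction` class, the `route` and `apply_human_labels` functions, and the 0.9 threshold are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Prediction:
    item_id: str
    label: str
    confidence: float  # model's own confidence, assumed to lie in [0, 1]


def route(predictions, threshold=0.9):
    """Split predictions into auto-accepted ones and ones needing human review.

    The threshold is an illustrative value, not one taken from the interview.
    """
    accepted, review = [], []
    for p in predictions:
        (accepted if p.confidence >= threshold else review).append(p)
    return accepted, review


def apply_human_labels(review, human_labels):
    """Turn reviewed exceptions into (item_id, label) training examples.

    `human_labels` maps item_id to the label a human reviewer chose.
    Items that have not been reviewed yet are skipped.
    """
    return [
        (p.item_id, human_labels[p.item_id])
        for p in review
        if p.item_id in human_labels
    ]


# Hypothetical vehicle-damage predictions.
predictions = [
    Prediction("img-1", "damage", 0.97),
    Prediction("img-2", "no damage", 0.55),
]
accepted, review = route(predictions)
new_examples = apply_human_labels(review, {"img-2": "damage"})
print(len(accepted), len(review), new_examples)
```

Raising the threshold sends more items to human review and lowering it sends fewer, so the split between machine judgment and exception handling is a tunable choice rather than a fixed 85/15.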
Q. So you are hiring people?
A. We’re absolutely hiring people. We’re really excited about where AI is going. As much as possible, our goal is to involve humans in those AI projects to ensure high quality and to ensure that we’re providing jobs for plain old human beings for the long term.
For more information, go to Alegion.com.