We’ve recently started collaborating with Jim Chiang, the hands-on AI architect at Apttus, on applying AI/NLP technologies in the legal industry to reduce contract risk and automate contract negotiation. Jim specializes in Big Data Analytics, Machine Learning, Deep Learning, Data Mining, Predictive Analytics, SaaS and Cloud Analytics. With a colorful career history that includes AT&T Bell Laboratories, IBM, Conviva and more, we couldn’t pass up the opportunity to ask Jim some questions when he came to visit our headquarters in Novi Sad.
What is your role at Apttus? How did you come to the idea of collaborating with Vivify Ideas and forming the AI team? How has cooperation with Vivify Ideas been so far and in what form?
It’s actually hard to find and intelligently select firms in Serbia because it’s pretty far away, but we engaged with Vivify Ideas to do some UX outsourcing work. And during that time, we found that there were people on the team who were very competent and capable of doing much more. Based on that experience, we decided to explore whether it would make sense to do this project.
I think the reality is that when you talk about AI, just because it’s such a hot concept, anybody who has any working experience in this space will likely be hired very, very quickly. That’s kind of unfortunate because I think there’s a lot of money and investment that’s simply wasted, because it’s really critical to actually find the right talent to do those things.
In Novi Sad, I think there is a lot of talent and we’ve fortunately been able to find it and work very closely with Vivify Ideas to articulate what this vision is and the collaboration necessary to turn this idea into a fully functioning product. That’s a hard thing to do, especially given our geographies, but it really is important to have talent across all these disciplines working intelligently together. So it’s not just a one-way collaboration in which we explain each and every necessary step, but collaboration in the sense of “well, here’s what we want to get to, how do we get there?”.
So there is a lot of back and forth happening between both parties when discussing this particular product idea?
Yes, and I think that whenever you have projects like this, the more successful ones are an active collaboration between parties, as opposed to something that’s just a purely outsourced relationship. So I think that’s actually been pretty fulfilling for us. Also, Vivify Ideas has been very effective in helping us determine where the talent is, getting the word out about what we’re trying to do and transferring some of the excitement.
I think you guys have built a culture that actually retains people and I think that’s critical. For specialized talent, you want to make sure you’re doing some investments to kind of bring them up to speed and also make sure they’re equipped to take it to the next level.
The talk you held here in Novi Sad in March has certainly sparked a lot of interest from our local AI community, especially as it seems like a very large-scale project. Would you agree?
Well, this is an enormous, multiyear vision that we’re trying to accomplish. But it doesn’t necessarily mean that the whole vision has to be realized before someone actually understands the value of it. So we do anticipate that probably within a six to nine-month timeframe, we are able to ship a working product that actually adds value to our existing customers. That’s basically the initial goal. These are the incremental milestones we have to achieve in order to realize the long-term vision. No project that transforms how people do things can happen overnight.
So the way to actually position these types of things is to think about AI first in terms of ways to detect the information, and then comes the second order of things, which is recommending next steps. That’s actually a much broader problem.
The first aspect, actually detecting clauses in contracts, detecting where the information resides in contracts, as well as detecting certain types of risk profiles within contracts, covers things that are pretty mature in terms of the AI space. Those are the things that we expect to be able to deploy successfully.
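To make the “detection first” idea concrete, here is a deliberately naive sketch of risk detection in clause text. The risk labels and keyword patterns are entirely hypothetical; a real system like the one described would use trained NLP models rather than keyword matching.

```python
# Hypothetical sketch: a naive keyword-based detector for risky contract
# clauses. This only illustrates the "detect first" framing; a production
# system would use a trained NLP model instead of keyword matching.

RISK_PATTERNS = {
    "unlimited_liability": ["unlimited liability", "without limitation of liability"],
    "auto_renewal": ["automatically renew", "auto-renewal"],
    "indemnification": ["indemnify", "hold harmless"],
}

def detect_risks(clause_text):
    """Return the risk labels whose patterns appear in the clause."""
    text = clause_text.lower()
    found = []
    for label, patterns in RISK_PATTERNS.items():
        if any(p in text for p in patterns):
            found.append(label)
    return found

clause = "Supplier shall indemnify and hold harmless the Customer."
print(detect_risks(clause))  # risk labels matched in this clause
```

Only once clauses and risk profiles are detected does the “second order” step, recommending next actions, become possible.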
And then, as AI advances happen, you’re going to see more and more innovation becoming possible in the next six to 12 months that will then transform the product.
What are other large-scope projects that you’ve been involved in that use NLP?
Before this, I was working with a company called Conviva, which monitored large numbers of video streams across a worldwide environment. As an example, we had enormous amounts of data aggregation happening on a nightly basis; you had video streams for things like the Super Bowl, the Masters and the World Cup, as well as episodic content for HBO, with shows such as Game of Thrones.
So you see huge amounts of network traffic and data that happen for specific events and that allowed us to experiment with different types of machine learning, as well as new AI technology and see what works and what doesn’t.
The reason why Apttus is interesting is that when you talk about artificial intelligence, you’re talking about being able to decipher or infer different types of information from what is called noisy data. For example, if you have an image which contains a lot of information, only a very small subset of it relates to the “cat image” we were talking about. AI is very good at seeing through all the noise in the image and finding exactly what you’re looking for.
That class of problems really spans only four different areas – images, sound, video and, lastly, text. Now, when dealing with large quantities of text and things like books, people are experimenting with ways to synthesize books right on the fly, which is also a really interesting area of research. But the other crucial area is legal contracts and the transactional elements between companies, which involve huge amounts of cycles spent on them – not to mention large amounts of text that we are able to actually leverage.
What is your impression of the state of AI here in Novi Sad?
Part of the reason why I’m here is that I think the education system in Serbia, and in Novi Sad in particular, is very, very strong. AI is very different from most computer science disciplines in the sense that there are not as many well-defined answers to a lot of different problems, because you see a lot of research, and Novi Sad is one of the institutes actually publishing research. But research is quite global in nature. And we’re actually at the point in AI where, for all the talk about it, we’re still very much in the hype cycle. People are so excited about this topic because most people think that AI is one of those things that are going to come back and kill everybody, because that’s what the movies talk about.
And that may or may not be true, okay? I can’t vouch for whether something like that will happen in the future or not. But what I do know is that, based on the current advances in AI, there are very practical applications that are still being explored. And that’s part of the reason why we are very much looking for some of the brightest engineers, really the top of the class, to be at the forefront of building a core AI engineering team in Novi Sad. Which is quite different from a lot of other projects that are going on in Novi Sad, right?
Whereas a lot of the outsourcing activities in Novi Sad are non-core, non-IP work, what we’re talking about is a core AI engineering team that is actually building AI, and building IP within it. We’re talking about the core algorithm development that’s critical for how well the product actually works.
In terms of AI in Novi Sad, one of the great aspects of the Serbian educational system is that regardless of the computing discipline, you usually have to do a lot of core mathematics and statistics education. And AI is unique in the sense that it’s partly mathematical, partly data science and partly computer science. So you need somebody who has a cross of those skills to be effective in that way. It’s a very unique blend of talents that, unless you actively look for it, is hard to find. It’s very hard to find.
If we focus particularly on university students, what AI knowledge are they currently getting at our universities? Do you think there will be potentially interesting profiles by the end of their studies?
Uhm, it depends. I think that there has to be a natural curiosity in terms of how these disciplines actually mix together. AI actually has a very strong academic foundation. Just to give you a little history, the concept of AI was probably invented in the 1950s, when there was no computing power behind it. It was pretty much useless, born as a theoretical concept. And for the most part, that remained the case for many, many years to follow, up until 2012. It was only seven years ago that a number of grad students in Toronto, on an occasion when their professor left town, took the basis of older AI technology concepts and ran them on a larger computing infrastructure. What was remarkable was that, contrary to most people’s expectations, the empirical results that came out of that study were absolutely groundbreaking. And that only happened in 2012, in the lab of Professor Geoffrey Hinton.
And so if anyone goes and says “Well, I’ve been an expert in AI for the past 10 years”, it’s completely fictional, as there was only a handful of people who actually kept the concept of AI around, and it’s only since 2012 that we’ve had a number of advances, in 2014 and 2015. All of a sudden there were a lot more research dollars invested in it, especially in Silicon Valley and most of the larger companies, because they all believe in the future. However, if you think about how many applications have come out of the AI space, it’s still very, very limited. We’re still in the infancy of actually introducing AI applications that really work for people.
And at what point can we, with certainty, refer to them as truly AI applications?
I think that’s actually a very important distinction, and I’ll explain what that means to me. If you look at the human brain, it is kind of the inspiration behind AI. Now, before AI, and before 2012, most of the advancements were basically related to very simple regression problems. For example, if you had some type of predictive model that says “based on the number of bedrooms, this house costs that much”, you could roughly fit a curve and make it decently precise. And you’re really only talking about a small amount of data. What I am describing here is a simplification of traditional machine learning, and many people actually confuse those terms.
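The “bedrooms to price” model Jim describes can be sketched as a one-feature least-squares fit. The data points below are made up purely for illustration:

```python
# Minimal sketch of the "bedrooms -> price" regression described above:
# ordinary least squares with one feature, computed in closed form.

def fit_line(xs, ys):
    """Fit y = a*x + b by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    a = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / \
        sum((x - mean_x) ** 2 for x in xs)
    b = mean_y - a * mean_x
    return a, b

bedrooms = [1, 2, 3, 4]
prices   = [100, 150, 200, 250]  # in thousands, hypothetical data
a, b = fit_line(bedrooms, prices)
print(a, b)       # slope and intercept of the fitted line
print(a * 5 + b)  # predicted price for a 5-bedroom house
```

The whole model is just two numbers, a slope and an intercept, which is exactly the “small amount of data” regime that traditional machine learning handles well.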
The difference between traditional machine learning and artificial intelligence is as dramatic as can possibly be. Traditional approaches in computer vision use traditional learning methods to take an image and try to extract all of the interesting aspects of the image, which are then reduced to a small set of numeric representations that say this is a number one or a number three, or something along those lines.
The new advances in computer vision based on artificial intelligence, on the other hand, are very much geared towards how a computing element modelled on an artificial neuron, the basic element of brain operation, as well as on the optical nerve and how a human actually perceives sight, is able to identify a cat in an image. That’s something we do intuitively, but only recently have we actually tried to replicate it, having the computing infrastructure detect whether a cat is actually in that image or not. And that’s completely without any of the labor-intensive human feature-engineering steps that try to reduce the image to just a small set of numbers indicating this is a cat.
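The artificially modelled neuron mentioned here can be sketched in a few lines: each neuron computes a weighted sum of its inputs plus a bias, then applies a nonlinearity. Real vision networks stack millions of these and learn the weights from data; the weights below are invented for illustration only.

```python
import math

def neuron(inputs, weights, bias):
    """One artificial neuron: sigmoid(w . x + b)."""
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    return 1.0 / (1.0 + math.exp(-z))  # sigmoid activation, output in (0, 1)

# Two pixel intensities feeding one neuron with hypothetical weights.
activation = neuron([0.8, 0.2], [1.5, -0.5], bias=-0.4)
print(activation)  # a value between 0 and 1
```

The point is that the feature extraction happens inside the network itself, in the learned weights, rather than in hand-written human engineering steps.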
The reason why it’s so important is that this basic technology is then used to detect a whole range of objects, whereas traditional machine learning really isn’t able to generalize to these harder problems. Because artificial intelligence is geared towards the human brain and its constructs, it is much more capable of generalizing to larger problem domains that are very difficult, if not impossible, for traditional machine learning.
That’s why you see all these advances in autonomous vehicles, sound and audio detection and all these devices that are able to understand the spoken word. So there are a number of things that have had to happen in order to make artificial intelligence practical, and one of them is basically large amounts of data, because without a large amount of data you can’t really use AI.
In the second part of the interview with Jim, we’ll be discussing the recent advancements of AI, Jim’s predictions on the “next big thing” and common obstacles in AI design. If you’re remotely interested in this topic, you don’t want to miss out on that one.