
How to Spot Great AI Opportunities in Your Business


Investment in Artificial Intelligence (AI) is growing rapidly and is increasingly affecting organizations well outside the tech sector. You probably suspect AI can dramatically improve your business, but perhaps you are not sure exactly how -- perhaps you are even worried that a startup or tech company is gunning for your business.

One question at the top of folks' lists that I often get asked is: “How do I spot AI opportunities in my business?” My response is to start asking questions. What am I looking for during my line of questioning? I’m ultimately trying to get down to two things: 1. which high level business problems could be transformational if solved, and 2. which business problems are feasible to solve with AI to some meaningful degree in the short term and with increasing effectiveness over time.

Understand your existing data assets

My first line of questioning is directed at understanding what data you have today. For example, is your data largely images, video, audio, text or some other business data? What is your data describing? Is it customer profile data? Does it contain aerial imagery of crops? Is it images of patient skin lesions? Is it texts between politicians and their constituents? If your data is unstructured (e.g. imagery, audio, text), I might ask about metadata, that is, data about your data such as author, title, origination_date, or sender. If your data is structured (tabular) or semi-structured (tabular with some referenced unstructured data), is your data scattered around your business in different business unit databases? If so, what does each of those business units do, and how difficult is it to access this disparate data? The answers to these and similar questions help us to understand what data you have, how readily accessible it is, how difficult it will be to transform into AI-ready data, and how we can improve its AI readiness and predictive impact over time.

Discuss and document your data intuition

Most folks have some intuition about high value things that can be done with their data. For example, an insurance company executive might have intuition that recordings of claims conversations can reveal insights about whether claims are being fairly addressed in a standardized fashion. A manager of a rooftop construction company might think inspections of their rooftop drone footage could be automated. An HR leader in a company might believe they can predict when an employee is no longer engaged based on how they communicate in platforms like Slack. A physician might realize they are not that great at understanding exactly how to classify a skin lesion anomaly like a purpura. One rule of thumb in developing and understanding your intuition about what can be done with your data is to

look for wherever humans are acting more like robots, that is, spending a lot of time looking for patterns.

So what do we mean by looking at patterns? It might help to think about this in the context of a particular type of data. For example, looking for patterns in an image might mean counting objects, like the number of people on a bus, or the density of fish in a school. In a video, looking for patterns might include quantifying things over time, for example, how much coral is in an area being studied by marine biologists, how much fat is in a human body as seen from an in-body camera, or how much vertical head oscillation is present in a runner.


Looking for patterns in text might be about how many times a person or topic is mentioned over time. For example, one of our customers was a group of historians trying to understand what was known, and by whom, in the State Department during the Iranian Revolution. In text, we also might look for sentiment patterns, that is, many categories of ways to say something positive, and many categories of ways to say something negative. Finally, in business data, the

patterns might manifest themselves in a playbook of rules like heuristics for determining unhappy customers or knowing which customers are likely to stick around for a long time.

Oftentimes, departments explicitly document these playbooks, even iterating on and improving them over time. AI algorithms are great at matching and often outperforming humans at predicting specific things, like customer happiness, likelihood of quitting, or total lifetime value. In addition, AI algorithms are often more dynamic than traditional, manually intensive statistical techniques and, if provided the right training data, usually outperform humans at regularly adapting playbook rules, such as which attributes are most predictive of, say, customer happiness, likelihood of quitting, or fraudulent activity.

List your high value business problems

I usually start by looking for one of the following:

  • a major efficiency improvement that might yield new capabilities and result in important new business opportunities

  • a significant business cost which is imposed or will be imposed if efficiency improvements are not found

  • an impactful action if taken at the right time that will save or make significant money or positively affect an important outcome


Let's look at how we can identify significant efficiency improvements. Brainstorming answers for a few questions might help tease these out:

  • What are your employees spending a lot of time on?

  • When servicing your customers, where do you find bottlenecks occurring?

  • Are there activities where your employees just cannot keep up?

And thoughtful discussion around these questions might tease out impactful actions that could, if taken at the right time, make a significant impact:

  • Which outcomes really matter to your business?

  • What actions, however difficult or expensive, could help improve these outcomes?

Here are a few example statements of high value business problems that came out of lines of questioning like those above. These statements might help illustrate what you should be after:

  • Physicians currently fail to diagnose heartbeat anomalies 40% of the time.

  • We currently lose 18% of our transactions to fraud.

  • Our customers currently spend $10 on average and we need to raise it to $100.

  • Our reviewers just can’t keep up with the new content they must review.

  • Many of our customers purchase the wrong plans, then quit a few months later. If only we could direct them to the right ones up front.

  • Experts teach students 1 on 1 in our premium service. This only works in the wealthy West. If only we could drop this cost by 10x, we could address entirely new markets in poorer nations.

Distill high value business problems to AI opportunities

The next step can be awkward for those not used to applying AI solutions to a wide variety of problems. But first, we discuss some high level AI basics. AI solutions typically boil down to a few types. The first type is supervised learning, where many examples of specific outcomes are provided and the system is then trained and evaluated on its ability to predict outcomes for examples it has never seen. The second type is called unsupervised learning, where patterns are naturally discovered in the data. These patterns can in turn be used to determine other things to predict, to find additional training examples for known things to predict, or to better understand the thing being studied, e.g. reasons a customer might quit, types of scenes in a video, types of images in a collection, etc.

Screen Shot 2019-05-10 at 5.48.47 PM.png

Using this basic AI knowledge, we can cycle back to some of our example statements above and stack rank a set of AI opportunities.

Screen Shot 2019-05-10 at 3.56.50 PM.png

Stack rank your AI opportunity candidates

Once you have your list of AI opportunities, and you understand which high value business problems they address, you are ready to stack rank them. You are, however, missing one key ingredient, namely, feasibility. You next need to determine the feasibility of any AI solution you can build for your problem. Assessing AI solution feasibility has many aspects. To name a few:

  • Likely achievable efficacy (how accurate it will be) in the near and far time horizons

  • How many resources (people and money) a solution requires

  • How difficult it is to acquire and label (where applicable) ground truth (i.e. training and test) data

Assess your achievable efficacy

Regarding assessing achievable efficacy, the obvious thing to do is to actually build and evaluate models. Oftentimes I see folks become paralyzed by the size or breadth of their machine learning problem. I recommend sampling early on to deal with size, as the key at this stage is simply to assess feasibility. Scaling training is something I see too many people worry about prematurely; it is almost always best to start small and on one machine to assess whether the problem is worth investing in. Breadth is another aspect where it is easy to be overwhelmed. I was recently working on a problem with a few hundred columns of data, about 20 of which were potentially viable as target variables (things to predict). It is easy to spend too much time making sure all potential feature variables are incorporated into an analysis, or all potential targets are addressed. At this stage, I recommend doing the absolute minimum amount of work in a first pass to check for signal and sanity check efficacy numbers. You may need to make another pass or two, but you may disprove feasibility very quickly, and it is important not to spend excessive time on the wrong business problem.
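To make the "sample, then check for signal" idea concrete, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the data is synthetic, the single feature and binary target are hypothetical, and the one-feature threshold rule (a "decision stump") simply stands in for whatever quick model you would actually try first. The point is only the shape of the first pass: take a small sample, hold some of it out, and compare a trivial model against a majority-class baseline.

```python
import random

def accuracy(predictions, labels):
    """Fraction of predictions that match the labels."""
    return sum(p == y for p, y in zip(predictions, labels)) / len(labels)

def predict_stump(values, threshold, high_label):
    """Predict high_label at or above the threshold, the other class below it."""
    return [high_label if v >= threshold else 1 - high_label for v in values]

def fit_stump(values, labels):
    """Pick the threshold and direction on ONE feature that best separate 0/1 labels."""
    best_acc, best_threshold, best_high_label = -1.0, None, None
    for threshold in sorted(set(values)):
        for high_label in (0, 1):
            acc = accuracy(predict_stump(values, threshold, high_label), labels)
            if acc > best_acc:
                best_acc, best_threshold, best_high_label = acc, threshold, high_label
    return best_threshold, best_high_label

rng = random.Random(0)

# Hypothetical "full" dataset: 20,000 rows of (feature, label) with a noisy relationship.
full_data = []
for _ in range(20_000):
    label = rng.randint(0, 1)
    full_data.append((rng.gauss(2.0 * label, 1.0), label))

# Work on a small sample on one machine; scaling can wait until there is signal.
sample = rng.sample(full_data, 500)
train, test = sample[:350], sample[350:]
train_labels = [y for _, y in train]
test_labels = [y for _, y in test]

threshold, high_label = fit_stump([f for f, _ in train], train_labels)
stump_acc = accuracy(predict_stump([f for f, _ in test], threshold, high_label), test_labels)

# Baseline: always predict the most common training label.
majority = max((0, 1), key=train_labels.count)
baseline_acc = accuracy([majority] * len(test_labels), test_labels)

print(f"baseline accuracy={baseline_acc:.2f}, one-feature model accuracy={stump_acc:.2f}")
```

If even a crude model like this cannot beat the baseline on held-out data, that is a cheap early hint that the target may not be predictable from the features at hand, or that the labels need a closer look.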

Assess your resources

Once you have an idea of efficacy expectations, it is much easier to determine resources. For example, you will likely know whether you are talking about a few models or many. You will likely know whether a general model is sufficient, whether you need many independently fine tuned models, or whether you need both. You will also likely have a much better idea of your computational requirements, due in large part to the number and type of models, as well as lessons gleaned from observing your models in action during assessment.

Assess your ground truth development

In many cases, you will need a growing body of examples (ground truth) from which to train and test your models. To assess feasibility, I recommend just diving in and labeling a few hundred items yourself or with your data scientists. This is a great way to understand the cognitive complexity of the task as well as to understand the data much more intuitively. One classic mistake I often see is data scientists being unwilling to label their own data. Obviously, once you’ve hit some scale, handing labeling off to others is a necessity, but in the early stages, while the label distributions are still being determined and task complexity is still being understood, it is essential that data scientists get hands on with the labeling process. While labeling, you will come to understand how to scale the labeling process. For example, we often rely on our trusted annotators to label text into hundreds of categories, which is a cognitively complex task. When our data scientists label the data themselves, they are confronted with making category boundary guidelines, such as when to label something as negative versus neutral (is “you guys were fine” negative, or only something more explicit like “you guys were bad”?). Part of assessing the development path of your ground truth really comes down to questions like:

  • how many labels can one human do in what period of time?

  • how much does the time it takes to label select items vary?

  • how much human time does it take to correct model errors?
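The three questions above can be answered with very simple arithmetic once you have timed a small pilot labeling session. The Python sketch below assumes you recorded the seconds spent on each item; all of the timings are made-up illustrative numbers, not measurements from any real project.

```python
import statistics

# Hypothetical timings (in seconds) from a short pilot labeling session.
label_seconds = [12, 9, 15, 40, 11, 8, 14, 95, 10, 13]
# Hypothetical timings for reviewing and correcting a model's wrong predictions.
correction_seconds = [5, 4, 7, 6, 5]

# How many labels can one human do per hour?
labels_per_hour = 3600 / statistics.mean(label_seconds)

# How much does labeling time vary from item to item?
median_seconds = statistics.median(label_seconds)
spread_seconds = statistics.stdev(label_seconds)

# How does correcting model output compare with labeling from scratch?
correction_speedup = statistics.mean(label_seconds) / statistics.mean(correction_seconds)

# Rough human time needed to label 10,000 items at the pilot rate.
hours_for_10k = 10_000 / labels_per_hour

print(f"{labels_per_hour:.0f} labels/hour, median {median_seconds}s, stdev {spread_seconds:.1f}s")
print(f"correcting is about {correction_speedup:.1f}x faster; 10k labels take about {hours_for_10k:.0f} hours")
```

A large gap between the median and the mean, as in these made-up numbers, usually means a few items are much harder than the rest, which is worth knowing before you write annotation guidelines or budget annotator time.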

Complete ranking AI candidate opportunities

With an idea of achievable efficacy and the resources required to obtain it, you should now be in a good position to rank your AI opportunities by feasibility. You can make a light pass through the above three steps and only iterate as you get serious about pursuing a particular AI opportunity. For example, an experienced data scientist can just look at a few hundred images or video frames and know roughly what type of accuracy they can expect in a few weeks, months or years of time.
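One simple way to turn these assessments into a ranking is to score each candidate and sort. The Python sketch below is only an illustration: the 0 to 1 scores are invented, the candidate names are loosely based on the example statements earlier, and the scoring rule (business value times the average of the three feasibility aspects) is an assumption rather than a prescribed formula. Use whatever weighting reflects your own priorities.

```python
from dataclasses import dataclass

@dataclass
class Opportunity:
    name: str
    business_value: float     # 0-1, how much solving it would matter (hypothetical score)
    efficacy: float           # 0-1, likely achievable accuracy
    resource_fit: float       # 0-1, higher means fewer people and less money needed
    ground_truth_ease: float  # 0-1, higher means labels are easier to acquire

    @property
    def feasibility(self) -> float:
        # Assumed rule: plain average of the three feasibility aspects.
        return (self.efficacy + self.resource_fit + self.ground_truth_ease) / 3

    @property
    def score(self) -> float:
        # Assumed rule: value and feasibility must both be high to rank well.
        return self.business_value * self.feasibility

candidates = [
    Opportunity("Recommend the right plan up front", 0.6, 0.6, 0.7, 0.5),
    Opportunity("Flag likely fraudulent transactions", 0.9, 0.7, 0.6, 0.8),
    Opportunity("Prioritize content for reviewers", 0.7, 0.8, 0.8, 0.8),
]

ranked = sorted(candidates, key=lambda c: c.score, reverse=True)
for rank, c in enumerate(ranked, start=1):
    print(f"{rank}. {c.name}: value={c.business_value:.2f}, "
          f"feasibility={c.feasibility:.2f}, score={c.score:.2f}")
```

Multiplying rather than adding means a candidate with near-zero feasibility cannot be rescued by high business value, which matches the idea that both ingredients are needed.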

Sanity check your AI opportunities

Now that you have a stack ranked list of your AI opportunities and a good idea about their feasibility, you can take another look at the most important ones, in the context of your high value business problems, and see whether pursuing them really will move the needle a lot for your business. Note, this is not always the case. Sometimes we tease out an AI opportunity, but realize after further reflection that solving it does not really move the needle for the business. And of course, once you have identified your great AI opportunities, and they have stood up to significant critique, go forth and get a prototype into the field as fast as you can.

According to a recent McKinsey report, “You don’t have to go it alone on AI -- partner for capability and capacity. Even large digital natives such as Amazon and Google have turned to companies and talent outside their confines to beef up their AI skills.” Obviously we at Xyonix agree.

Need help spotting and vetting great AI opportunities in your business? Contact us, we’d love to help.

Using AI to Improve Sports Performance & Achieve Better Running Efficiency

Screen Shot 2019-04-29 at 11.29.07 AM.png

Professional sports teams leverage advanced technology to take them over the top, outfitting players with a suite of sensors to measure, quantify, and analyze every aspect of player and team performance. Amateur athletes seek help from professional trainers, but the associated session and equipment costs can be prohibitive. With the ubiquity of video recorders embedded in today’s phones and modern advancements in computer vision technology, it raises the question:

Can amateur athletes improve their performance using artificial intelligence and nothing more than a smart phone? As an AI practitioner and a dedicated runner, I decided to find out.

1.25 million. That’s the approximate number of running steps that I took to train for and complete my last marathon in 2016. With an impulsive impact at each step, overtraining and poor form can place undue strain on muscles, joints, and tendons, and the cumulative effect can lead to serious injury. I learned this lesson all too well in 2012 when I tore my Achilles tendon during training and had to abandon running for nearly two years as a result. At that time, short sprints to catch a bus or playing with my kids on the soccer field ultimately led to discomfort and pain. In 2014, I finally found relief and recovery through surgery but was told by my physical therapist, “you probably should never run another marathon again” as my aging body was “not as resilient as it once was.” Not the news I wanted to hear. Certainly, I am not alone in confronting a running injury. In fact, it is estimated that 40% to 50% of all runners experience at least one injury on an annual basis [1]. The popularity of the sport has also risen dramatically over the last decade, with an estimated 40.43% growth in the number of people participating in marathons worldwide from 2008 to 2018 [2].


Marathon runners, by their nature, are determined folk and I wasn’t about to let my physical therapist thwart my aspirations to continue running. However, I was mindful of my form, which I theorized was the primary cause of my injury, outside of ignoring some of the aches and pains I experienced during training.  I adopted a form of running thought to attenuate risk of injury while optimizing efficiency [3, 4]. Some basic concepts of this form are:

  • maintain a high cadence of 180+ steps/min

  • limit ground contact time and vertical motion

  • display consistent and symmetric motion with left/right appendages

  • strike the ground with a slightly bent knee, forward of the ankle at impact

  • maintain a forward leaning posture from the feet to the head, forming a “straight” invisible line between the ankle, hip, and shoulder at impact

  • replace dorsiflexing + heel strike with a mid-front foot strike followed by a push backwards and upwards

Many of these aspects are feasible for the average runner but are difficult to accomplish by feel alone. Sports watches loaded with sensors coupled with chest straps provide excellent feedback by measuring various performance metrics but are limited in their ability to convey motion symmetry and running form. As a complement to the data I receive from my sports watch, I wanted a tool that I could use to provide visual feedback on my running form and facilitate quantification of relevant performance and form metrics. My colleagues and I at Xyonix were exploring some intriguing pose estimation tools that I thought might be able to help.

Body pose estimation is an advanced AI technology that automatically generates a skeleton for people in each frame of a video. The results can be powerful for sports improvement

as human motion and related kinematics can be tracked and analyzed. Pose estimation models identify pre-defined points of interest on a human body, such as joints and organs, which are subsequently linked to form a computer generated “skeleton” of each person in an image. Skeletal motion can be tracked across multiple frames of a video and translated to estimate body kinematics, which are used directly to assess running performance. The only hardware that is required is your mobile phone to record a video.  

VISUAL OBSERVATIONS

Consider the following mobile phone video recording of my daughter and me running at a local park. Overlaid on each frame of the videos are:

  • Colored line segments comprising computer generated body skeletons as predicted by a body pose estimation model.

  • Trails of red, blue, and white circles representing the latest history of nose, right ankle, and left ankle positions, respectively.

  • Colored rectangular regions whose vertical span indicates the maximum range of vertical oscillation encountered in the displayed histories.

The isolation of a single human body against a backdrop of sticks, branches, trees, and fencing is impressive given that these components could mistakenly be construed as body parts. While the pose estimation model is capable of identifying multiple figures at multiple depths in a scene [5], we recorded only one runner at a time for visual clarity and to facilitate isolated analysis.

Visual inspection of the videos shows a striking difference between my running form and that of my daughter:

  • My daughter’s strides are more consistent and symmetric, as indicated by the left ankle (white dot) and right ankle (blue dot) trails. I tend to raise my right ankle higher than my left in the back stroke, which likely means that I’m not engaging my left gluteus maximus as much as my right. It also means that my right foot has a slightly longer distance to travel to prepare for the next step. This issue may seem trivial but the lack of symmetry can throw off my timing, which can lead to improper foot placement, which can lead to harsher impacts, which can lead to joint, muscle, or tendon damage. Over hundreds of thousands of steps, stride asymmetry can promote injury.

  • The average vertical oscillation of my daughter’s head, indicated by the trail of nose positions (red dots) tracked over time, is seemingly less than mine during a typical stride.

QUANTIFYING PERFORMANCE METRICS

Below is a plot of the vertical motion of various body parts tracked over time throughout the video. These series are colored based on ground height, which was estimated via a “pixels to inches” conversion behind the scenes.

Screen Shot 2019-04-29 at 11.50.40 AM.png

With these data in hand, we are set to perform a more detailed analysis. Our goal is to confirm qualitative observations with quantitative metrics. We follow each estimated metric with a TAKEAWAY section, which identifies practical advice for future training.

Let’s begin with cadence, which is theorized to be positively correlated with running efficiency [6]. Cadence is defined as the average number of steps over a given time interval, typically given in units of steps per minute. We estimate cadence as follows:

  • Isolate left and right ankle series.

  • Remove outliers and perform local interpolation to refill gaps.

  • Smooth and detrend each series and use the result to estimate local minima, whose locations identify impact locations in time, i.e., step times.

  • Estimate the cadence as the median number of steps per minute.
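The core of these steps can be sketched in a few lines of Python. This is a simplified illustration, not the code used to produce the results reported here: it assumes each ankle series is already clean and expressed as height above the ground (so a local minimum corresponds to a foot strike), it skips outlier removal and detrending, and it reads "median number of steps per minute" as 60 divided by the median time between successive steps. The 30 frames-per-second rate and the sinusoidal ankle motion are synthetic stand-ins for real pose estimation output.

```python
import math
import statistics

def moving_average(series, window=5):
    """Smooth with a centered moving average (the window shrinks at the edges)."""
    half = window // 2
    return [
        sum(series[max(0, i - half): i + half + 1]) / len(series[max(0, i - half): i + half + 1])
        for i in range(len(series))
    ]

def local_minima(series):
    """Indices of samples lower than the previous sample and no higher than the next."""
    return [i for i in range(1, len(series) - 1)
            if series[i - 1] > series[i] <= series[i + 1]]

def cadence_steps_per_min(left_ankle, right_ankle, fps):
    """Estimate cadence from the median time between successive foot strikes."""
    step_frames = sorted(local_minima(moving_average(left_ankle))
                         + local_minima(moving_average(right_ankle)))
    step_intervals = [(b - a) / fps for a, b in zip(step_frames, step_frames[1:])]
    return 60.0 / statistics.median(step_intervals)

# Synthetic 10-second clip: each leg cycles at 1.5 Hz, legs half a cycle apart.
fps = 30
frames = range(10 * fps)
left = [4 + 4 * math.sin(2 * math.pi * 1.5 * i / fps) for i in frames]
right = [4 + 4 * math.sin(2 * math.pi * 1.5 * i / fps + math.pi) for i in frames]

cadence = cadence_steps_per_min(left, right, fps)
print(f"Estimated cadence: {cadence:.0f} steps/min")
```

Note that at 30 frames per second a step time can only be located to the nearest frame, so on real footage the estimate is limited by the frame rate unless the minima are refined by interpolation.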

Screen Shot 2019-04-29 at 11.53.52 AM.png

My cadence is estimated at 169 steps/min while my daughter’s is 173 steps/min. These results match well with typical cadence estimates as reported independently by GPS running watches.

TAKEAWAY If we are to adhere to the advice to run at a cadence of 180+ steps/min, it looks like we both need to pick up the tempo a bit.

We also can use the ankle series to quantify the median difference in left/right ankle elevations on the backward portion of the running stride, a metric we will call median stride asymmetry:

  • Extract left and right detrended ankle series from cadence assessment.

  • Find local maxima.

  • Calculate elevation differences between left-right pairs and report the median of those differences.
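A minimal Python sketch of the asymmetry calculation follows. It assumes the two ankle series are already detrended and in inches, and it pairs left and right peaks simply by their order of occurrence, which is a simplification; a more careful version would pair each peak with the nearest opposite-leg peak in time. The input is synthetic: the left ankle is made to peak at 12 inches and the right at 17 inches purely for illustration.

```python
import math
import statistics

def local_maxima(series):
    """Indices of samples higher than the previous sample and no lower than the next."""
    return [i for i in range(1, len(series) - 1)
            if series[i - 1] < series[i] >= series[i + 1]]

def median_stride_asymmetry(left_ankle, right_ankle):
    """Median absolute difference between paired left/right back-stroke peak heights."""
    left_peaks = [left_ankle[i] for i in local_maxima(left_ankle)]
    right_peaks = [right_ankle[i] for i in local_maxima(right_ankle)]
    # Pair peaks in order of occurrence (assumes no missed or spurious peaks).
    differences = [abs(l - r) for l, r in zip(left_peaks, right_peaks)]
    return statistics.median(differences)

# Synthetic ankle heights (inches) at 30 fps; legs are half a cycle apart.
fps = 30
frames = range(10 * fps)
left = [6.0 + 6.0 * math.sin(2 * math.pi * 1.5 * i / fps) for i in frames]
right = [8.5 + 8.5 * math.sin(2 * math.pi * 1.5 * i / fps + math.pi) for i in frames]

asymmetry = median_stride_asymmetry(left, right)
print(f"Median stride asymmetry: {asymmetry:.2f} inches")
```

Using the median rather than the mean keeps a single badly tracked frame from dominating the result.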

Screen Shot 2019-04-29 at 12.00.58 PM.png

The results show that I have a whopping 5.29 inch median stride asymmetry while my daughter’s motion exhibits a more acceptable 2.34 inch value.

TAKEAWAY I need to focus on engaging my left gluteus maximus more and matching the height of my right ankle during the backstroke. Generally, I need to be more aware of my tendency to stride asymmetrically.

Finally, let us quantify and compare the vertical oscillation of the head using the following steps:

  • Isolate the nose position history.

  • Find local minima.

  • Detrend by subtracting a spline model fit using the minima from the original series.

  • Find the local maxima of the detrended series.

  • Recenter the series by subtracting the mean (optional, but lends to a nice visual display).

  • Calculate the absolute difference between successive minima-maxima pairs and report the median value.
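The steps above can be sketched in Python as follows. Two substitutions are made to keep the example dependency-free, and both are assumptions: the spline fit through the minima is replaced by straight-line interpolation between successive minima, and the optional recentering step is skipped because it does not change the differences. The nose series is synthetic: a 3-inch bounce superimposed on a slow upward drift.

```python
import math
import statistics

def local_minima(series):
    return [i for i in range(1, len(series) - 1)
            if series[i - 1] > series[i] <= series[i + 1]]

def local_maxima(series):
    return [i for i in range(1, len(series) - 1)
            if series[i - 1] < series[i] >= series[i + 1]]

def linear_baseline(series, anchors):
    """Piecewise-linear trend through the anchor points (stand-in for a spline fit)."""
    baseline = [series[anchors[0]]] * len(series)      # flat before the first anchor
    for a, b in zip(anchors, anchors[1:]):
        for i in range(a, b + 1):
            baseline[i] = series[a] + (series[b] - series[a]) * (i - a) / (b - a)
    for i in range(anchors[-1], len(series)):          # flat after the last anchor
        baseline[i] = series[anchors[-1]]
    return baseline

def median_vertical_oscillation(nose):
    """Median rise from each minimum to the following maximum of the detrended series."""
    minima = local_minima(nose)
    baseline = linear_baseline(nose, minima)
    detrended = [x - b for x, b in zip(nose, baseline)]
    maxima = local_maxima(detrended)
    swings = []
    for m in minima:
        following = [p for p in maxima if p > m]
        if following:
            swings.append(abs(detrended[following[0]] - detrended[m]))
    return statistics.median(swings)

# Synthetic nose height (inches) at 30 fps: slow drift plus a 3 Hz, 3-inch bounce.
fps = 30
nose = [60 + 0.01 * i + 1.5 * math.cos(2 * math.pi * 3 * i / fps) for i in range(10 * fps)]

oscillation = median_vertical_oscillation(nose)
print(f"Median vertical oscillation: {oscillation:.2f} inches")
```

Detrending through the minima matters here: without it, the slow drift in the raw series would leak into each min-to-max difference.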

Screen Shot 2019-04-29 at 12.01.10 PM.png

TAKEAWAY My daughter does a slightly better job of using her energy for forward propulsion, wasting less of it in vertical motion. This is something to continue to monitor over time, but my vertical oscillation will likely diminish naturally as I lessen my stride asymmetry, promote a stronger forward lean from my toes, and increase my cadence.

SUMMARY & EXTENSIONS

We have demonstrated the use of pose estimation for quantifying three running performance metrics.

Screen Shot 2019-04-29 at 12.01.26 PM.png

These metrics, combined with the visual feedback obtained from pose estimation videos, have given me something solid to work on in future training sessions. I look forward to recording and processing another video to verify that my efforts to adopt a better running form are working. I can validate these improvements quantitatively via the proposed metrics with a goal of making me a more efficient runner.

While we have demonstrated the efficacy of using pose estimation to better one’s running, the story doesn’t end there. Pose estimation can be used in a wide variety of important applications. Here is a list describing a few important applications of body pose estimation [7]:

  • Sports

    • kinematic analysis of tackles in American football

    • golf swing dynamics

    • tennis swing form

  • Assisted living

    • help future robots correctly interpret body language

  • Intelligent driver assistance

    • automated estimation of driver inebriation levels

  • Medical applications

    • detection of postural issues such as scoliosis

    • physical therapy

    • study of cognitive brain development in young children by monitoring motor functionality

    • identifying disability onset via correlated motion

    • behavioral understanding

  • Video games

    • avatar movement

    • gesture and motion recognition

Screen Shot 2019-04-29 at 12.01.55 PM.png

I am happy to report that, since adopting a new running form, I have been injury free for years now and feel less tired during training runs, which may be a testament to the efficacy of the technique. Even better, my children have taken an interest in running, achieving some pretty lofty goals at their young ages while spending precious bonding time with their dad. Pictured is one of those happy moments, where one of us won first place in their division. Can you predict which one of us was the victor?

Have sports performance data? Want to automatically recommend performance improvements to athletes? Contact us, we’ve taught machines to understand all kinds of video imagery and other performance improvement data — we might be able to help.


REFERENCES

  1. Fields, K.B., Sykes, J.C., Walker K.M., and Jackson, J.C. (2010). Prevention of running injuries. Current Sports Medicine Reports, May-Jun;9(3):176-82. doi: 10.1249/JSR.0b013e3181de7ec5.

  2. Marathon Statistics Worldwide, https://runrepeat.com/research-marathon-performance-across-nations

  3. Danny Dreyer and Katherine Dreyer, ChiRunning: A Revolutionary Approach to Effortless, Injury-Free Running (May 2009). Atria Books.

  4. https://www.runnersworld.com/training/a20854024/what-makes-a-running-stride-efficient/

  5. Zhe Cao, Tomas Simon, Shih-En Wei, Yaser Sheikh (2017) Realtime Multi-Person 2D Pose Estimation using Part Affinity Fields. Computer Vision and Pattern Recognition. Accepted as CVPR 2017 Oral. arXiv:1611.08050.

  6. https://www.mcmillanrunning.com/cadence/

  7. https://en.wikipedia.org/wiki/Articulated_body_pose_estimation