How to Spot Great AI Opportunities in Your Business


Investment in artificial intelligence (AI) is growing rapidly and increasingly affects organizations well outside the tech sector. You probably suspect AI can dramatically improve your business, but perhaps you are not sure exactly how; perhaps you are even worried that a startup or tech company is gunning for your business.

One question I often get asked is: “How do I spot AI opportunities in my business?” My response is to start asking questions. What am I looking for during this line of questioning? Ultimately, two things: 1. which high level business problems would be transformational if solved, and 2. which business problems are feasible to solve with AI to some meaningful degree in the short term, and with increasing effectiveness over time.

Understand your existing data assets

My first line of questioning is directed at understanding what data you have today. For example, is your data largely images, video, audio, text or some other business data? What is your data describing? Is it customer profile data? Does it contain aerial imagery of crops? Is it images of patient skin lesions? Is it texts between politicians and their constituents? If your data is unstructured (e.g. imagery, audio, text), I might ask about metadata, that is, data about your data such as author, title, origination_date, or sender. If your data is structured (tabular) or semi-structured (tabular with some referenced unstructured data), is your data scattered around your business in different business unit databases? If so, what do each of those business units do, and how difficult is it to access this disparate data? The answers to these and similar questions help us to understand what data you have, how readily accessible it is, how difficult it will be to transform into AI ready data, and how we can improve its AI readiness and predictive impact over time.

Discuss and document your data intuition

Most folks have some intuition about high value things that can be done with their data. For example, an insurance company executive might have an intuition that recordings of claims conversations can reveal whether claims are being addressed fairly and in a standardized fashion. A manager of a rooftop construction company might think inspections of their rooftop drone footage could be automated. An HR leader might believe they can predict when an employee is no longer engaged based on how they communicate in platforms like Slack. A physician might realize they are not that great at classifying a skin lesion anomaly like a purpura. One rule of thumb in developing and understanding your intuition about what can be done with your data is to

look for wherever humans are acting more like robots, that is, spending a lot of time looking for patterns.

So what do we mean by looking for patterns? It might help to think about this in the context of a particular type of data. For example, looking for patterns in an image might mean counting objects, like the number of people on a bus or the density of fish in a school. In a video, looking for patterns might include quantifying things over time, for example, how much coral is in an area being studied by marine biologists, how much fat is in a human body as seen from an in-body camera, or how much vertical head oscillation is present in a runner.


Looking for patterns in text might be about how many times a person or topic is mentioned over time; for example, one of our customers was a group of historians trying to understand what was known, and by whom, in the State Department during the Iranian revolution. In text, we also might look for sentiment patterns: many categories of ways to say something positive, and many categories of ways to say something negative. Finally, in business data, the

patterns might manifest themselves in a playbook of rules like heuristics for determining unhappy customers or knowing which customers are likely to stick around for a long time.

Oftentimes, departments explicitly document these playbooks, even iterating on and improving them over time. AI algorithms are great at matching, and often outperforming, humans at predicting specific things like customer happiness, likelihood of quitting, or total lifetime value. In addition, AI algorithms are often more dynamic than traditional, manually intensive statistical techniques: given the right training data, they usually outperform humans at regularly adapting playbook rules, such as which attributes are most predictive of, say, customer happiness, likelihood of quitting, or fraudulent activity.
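A playbook rule of the kind described above is often just a handful of hard-coded thresholds. Here is a minimal sketch; every attribute name and threshold is hypothetical, invented for illustration:

```python
def is_likely_unhappy(customer):
    """Hand-written "playbook" heuristic for flagging an unhappy customer.
    All thresholds here are made up for illustration."""
    return (customer["support_tickets_30d"] >= 3
            or customer["logins_30d"] == 0
            or customer["nps_score"] <= 4)

# example usage with made-up customer records
alice = {"support_tickets_30d": 4, "logins_30d": 12, "nps_score": 8}
bob = {"support_tickets_30d": 0, "logins_30d": 9, "nps_score": 9}
print(is_likely_unhappy(alice))  # True (too many recent support tickets)
print(is_likely_unhappy(bob))    # False
```

A trained model effectively replaces these fixed rules and thresholds with weights that are learned, and regularly re-learned, from data.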

List your high value business problems

I usually start by looking for one of the following:

  • a major efficiency improvement that might yield new capabilities and result in important new business opportunities

  • a significant business cost which is imposed or will be imposed if efficiency improvements are not found

  • an impactful action if taken at the right time that will save or make significant money or positively affect an important outcome


Let's look at how we can identify significant efficiency improvements. Brainstorming answers for a few questions might help tease these out:

  • What are your employees spending a lot of time on?

  • When servicing your customers, where do you find bottlenecks occurring?

  • Are there activities where your employees just cannot keep up?

And thoughtful discussion around these questions might tease out actions that, if taken at the right time, could make a significant difference:

  • Which outcomes really matter to your business?

  • What actions, however difficult or expensive, could help improve these outcomes?

Here are a few example statements of high value business problems that came out of lines of questioning like those above. They might help illustrate what you should be after:

  • Physicians currently fail to diagnose heartbeat anomalies 40% of the time.

  • We currently lose 18% of our transactions to fraud.

  • Our customers currently spend $10 on average and we need to raise it to $100.

  • Our reviewers just can’t keep up with the new content they must review.

  • Many of our customers purchase the wrong plans, then quit a few months later. If only we could direct them to the right ones up front.

  • Experts teach students one-on-one in our premium service. This only works in the wealthy West; if we could drop this cost by 10x, we could address entirely new markets in poorer nations.

Distill high value business problems to AI opportunities

The next step can be awkward for those not used to applying AI solutions to a wide variety of problems, so let’s first cover some high level AI basics. AI solutions typically boil down to a few types. The first type is supervised learning, where many examples of specific outcomes are provided, and the system is trained and then evaluated on its ability to predict outcomes for examples it has never seen. The second type is unsupervised learning, where patterns are discovered naturally in the data. These patterns can in turn be used to determine other things to predict, to find additional training examples for known things to predict, or to better understand the thing being studied, e.g., reasons a customer might quit, types of scenes in a video, or types of images in a collection.
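A toy illustration of the two modes, using tiny made-up 1-D data and the simplest possible models (nearest labeled example for supervised, a two-cluster k-means for unsupervised):

```python
# --- supervised: learn from labeled examples, predict unseen ones ---
labeled = [(1.0, "small"), (1.2, "small"), (8.9, "large"), (9.4, "large")]

def predict(x):
    # nearest-labeled-example classifier (1-NN), the simplest supervised model
    return min(labeled, key=lambda pair: abs(pair[0] - x))[1]

print(predict(1.1))  # prints: small
print(predict(9.0))  # prints: large

# --- unsupervised: discover structure with no labels at all ---
points = [1.0, 1.2, 1.1, 8.9, 9.4, 9.1]

def two_means(xs, iters=10):
    # tiny k-means (k=2): alternate point assignment and centroid update
    c1, c2 = min(xs), max(xs)
    for _ in range(iters):
        g1 = [x for x in xs if abs(x - c1) <= abs(x - c2)]
        g2 = [x for x in xs if abs(x - c1) > abs(x - c2)]
        c1, c2 = sum(g1) / len(g1), sum(g2) / len(g2)
    return sorted([c1, c2])

print(two_means(points))  # two cluster centers emerge from the raw data
```

The supervised model needed labels ("small"/"large"); the unsupervised one found the same two groups from the raw numbers alone.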


Using this basic AI knowledge, we can cycle back to some of our example statements above and stack rank a set of AI opportunities.

[Figure: example business problems distilled into candidate AI opportunities]

Stack rank your AI opportunity candidates

Once you have your list of AI opportunities, and you understand which high value business problems they address, you are ready to stack rank them. You are, however, missing one key ingredient: feasibility. Assessing the feasibility of an AI solution has many aspects. To name a few:

  • Likely achievable efficacy (how accurate it will be) in the near and far time horizons

  • What resources (people and money) are required for a solution

  • How difficult is it to acquire and label (where applicable) ground truth (i.e. training and test) data

Assess your achievable efficacy

Regarding assessing achievable efficacy, the obvious thing to do is to actually build and evaluate models. Oftentimes I see folks become paralyzed by the size or breadth of their machine learning problem. To deal with size, I recommend sampling early on; the key at this stage is simply to assess feasibility. Scaling training is something too many people worry about prematurely; it is almost always best to start small, on one machine, to assess whether the problem is worth investing in. Breadth is another aspect where it is easy to be overwhelmed. I was recently working on a problem with a few hundred columns of data, about 20 of which were potentially viable as target variables (things to predict). It is easy to spend too much time making sure all potential feature variables are incorporated into an analysis, or that all potential targets are addressed. At this stage, I recommend doing the absolute minimum amount of work in a first pass to check for signal and sanity check efficacy numbers. You may need to make another pass or two, but you may also disprove feasibility very quickly, and it is important not to spend excessive time on the wrong business problem.

Assess your resources

Once you have an idea of efficacy expectations, it is much easier to determine resources. For example, you will likely know whether you are talking about a few models or many. You will likely know whether a general model is sufficient, whether you need many independently fine tuned models, or whether you need both. You will also likely have a much better idea of your computational requirements, due in large part to the number and type of models, as well as lessons gleaned from observing your models in action during assessment.

Assess your ground truth development

In many cases, you will need a growing body of examples (ground truth) from which to train and test your models. To assess feasibility, I recommend just diving in and labeling a few hundred items yourself or with your data scientists. This is a great way to understand the cognitive complexity of the task as well as to understand the data much more intuitively. One classic mistake I often see is data scientists unwilling to label their own data. Obviously, once you’ve hit some scale this becomes a necessity, but in the early stages, while the label distributions are still being determined and the task complexity is still being understood, it is essential that data scientists get hands on with the labeling process. While labeling, you will learn how to scale the labeling process. For example, we often rely on our trusted annotators to label text into hundreds of categories, a cognitively complex task. When our data scientists label data themselves, they are confronted with making category boundary guidelines, such as when to label something as negative vs. neutral (is “you guys were fine” negative, or only something more explicit like “you guys were bad”?). Part of assessing the development path of your ground truth really comes down to questions like:

  • how many labels can one human do in what period of time?

  • how much does the time it takes to label select items vary?

  • how much human time does it take to correct model errors?
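Answers to these questions feed a back-of-the-envelope labeling budget. A minimal sketch; the function name and all input numbers below are hypothetical, the kind of figures you measure by labeling a few hundred items yourself:

```python
def labeling_days(n_items, secs_per_label, n_annotators, hours_per_day=6):
    """Rough calendar days needed to build a labeled ground truth set.

    secs_per_label and hours_per_day are assumptions you calibrate by
    timing yourself on a small labeling session first.
    """
    total_secs = n_items * secs_per_label
    return total_secs / (n_annotators * hours_per_day * 3600)

# e.g. 100k items at ~18 s each, with 5 annotators working 6 h/day
print(round(labeling_days(100_000, 18, 5), 1))  # 16.7
```

Even a crude estimate like this quickly reveals whether your ground truth plan is measured in days, months, or years.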

Finish ranking your AI opportunity candidates

With an idea of achievable efficacy and the resources required to obtain it, you should now be in a good position to rank your AI opportunities by feasibility. You can make a light pass through the above three steps and iterate only as you get serious about pursuing a particular AI opportunity. For example, an experienced data scientist can often just look at a few hundred images or video frames and know roughly what type of accuracy to expect within a few weeks, months, or years.

Sanity check your AI opportunities

Now that you have a stack ranked list of your AI opportunities and a good idea about their feasibility, you can look at the most important ones again, in the context of your high value business problems, and see whether pursuing them really will move the needle for your business. Note that this is not always the case. Sometimes we tease out an AI opportunity, but realize after further reflection that solving it does not really move the needle for the business. And of course, once you have identified your great AI opportunities and they have stood up to significant critique, go forth and get a prototype into the field as fast as you can.

According to a recent McKinsey report, “You don’t have to go it alone on AI -- partner for capability and capacity. Even large digital natives such as Amazon and Google have turned to companies and talent outside their confines to beef up their AI skills.” Obviously we at Xyonix agree.

Need help spotting and vetting great AI opportunities in your business? Contact us, we’d love to help.

Using AI to Improve Sports Performance & Achieve Better Running Efficiency


Professional sports teams leverage advanced technology to take them over the top, outfitting players with a suite of sensors to measure, quantify, and analyze every aspect of player and team performance. Amateur athletes seek help from professional trainers, but the associated session and equipment costs can be prohibitive. With the ubiquity of video recorders embedded in today’s phones and modern advancements in computer vision technology, a question arises:

Can amateur athletes improve their performance using artificial intelligence and nothing more than a smart phone? As an AI practitioner and a dedicated runner, I decided to find out.

1.25 million. That’s the approximate number of running steps I took to train for and complete my last marathon in 2016. With an impulsive impact at each step, overtraining and poor form can place undue strain on muscles, joints, and tendons, and the cumulative effect can lead to serious injury. I learned this lesson all too well in 2012 when I tore my Achilles tendon during training and had to abandon running for nearly two years as a result. At that time, short sprints to catch a bus or playing with my kids on the soccer field ultimately led to discomfort and pain. In 2014, I finally found relief and recovery through surgery but was told by my physical therapist, “you probably should never run another marathon again” as my aging body was “not as resilient as it once was.” Not the news I wanted to hear. Certainly, I am not alone in confronting a running injury. In fact, it is estimated that 40% to 50% of all runners experience at least one injury on an annual basis [1]. The popularity of the sport has also risen dramatically over the last decade, with an estimated 40.43% growth in the number of people participating in marathons worldwide from 2008 to 2018 [2].


Marathon runners are, by nature, determined folk, and I wasn’t about to let my physical therapist thwart my aspirations to continue running. I was, however, mindful of my form, which I theorized was the primary cause of my injury, beyond ignoring some of the aches and pains I experienced during training. I adopted a form of running thought to attenuate the risk of injury while optimizing efficiency [3, 4]. Some basic concepts of this form are:

  • maintain a high cadence of 180+ steps/min

  • limit ground contact time and vertical motion

  • display consistent and symmetric motion with left/right appendages

  • strike the ground with a slightly bent knee, forward of the ankle at impact

  • maintain a forward leaning posture from the feet to the head, forming a “straight” invisible line between the ankle, hip, and shoulder at impact

  • replace dorsiflexing + heel strike with a mid-front foot strike followed by a push backwards and upwards

Many of these aspects are feasible for the average runner but are difficult to accomplish by feel alone. Sports watches loaded with sensors coupled with chest straps provide excellent feedback by measuring various performance metrics but are limited in their ability to convey motion symmetry and running form. As a complement to the data I receive from my sports watch, I wanted a tool that I could use to provide visual feedback on my running form and facilitate quantification of relevant performance and form metrics. My colleagues and I at Xyonix were exploring some intriguing pose estimation tools that I thought might be able to help.

Body pose estimation is an advanced AI technology that automatically generates a skeleton for people in each frame of a video. The results can be powerful for sports improvement

as human motion and related kinematics can be tracked and analyzed. Pose estimation models identify pre-defined points of interest on a human body, such as joints and facial features, which are subsequently linked to form a computer generated “skeleton” of each person in an image. Skeletal motion can be tracked across multiple frames of a video and translated to estimate body kinematics, which are used directly to assess running performance. The only hardware required is a mobile phone to record a video.


Consider the following mobile phone video recordings of my daughter and me running at a local park. Overlaid on each frame of the videos are:

  • Colored line segments comprising computer generated body skeletons as predicted by a body pose estimation model.

  • Trails of red, blue, and white circles representing the latest history of nose, right ankle, and left ankle positions, respectively.

  • Colored rectangular regions whose vertical span indicates the maximum range of vertical oscillation encountered in the displayed histories.

The isolation of a single human body against a backdrop of sticks, branches, trees, and fencing is impressive given that these components could mistakenly be construed as body parts. While the pose estimation model is capable of identifying multiple figures at multiple depths in a scene [5], we recorded only one runner at a time for visual clarity and to facilitate isolated analysis.

Visual inspection of the videos shows a striking difference between my running form and that of my daughter:

  • My daughter’s strides are more consistent and symmetric, as indicated by the left ankle (white dot) and right ankle (blue dot) trails. I tend to raise my right ankle higher than my left in the back stroke, which likely means that I’m not engaging my left gluteus maximus as much as my right. It also means that my right foot has a slightly longer distance to travel to prepare for the next step. This issue may seem trivial but the lack of symmetry can throw off my timing, which can lead to improper foot placement, which can lead to harsher impacts, which can lead to joint, muscle, or tendon damage. Over hundreds of thousands of steps, stride asymmetry can promote injury.

  • The average vertical oscillation of my daughter’s head, indicated by the trail of nose positions (red dots) tracked over time, is seemingly less than mine during a typical stride.


Below is a plot of the vertical motion of various body parts tracked over time throughout the video. These series are colored based on ground height, which was estimated via a “pixels to inches” conversion behind the scenes.

[Figure: vertical motion of tracked body parts over time]
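The “pixels to inches” conversion mentioned above can be as simple as scaling by a known reference length. A minimal sketch; the reference numbers here are hypothetical, not the ones used in the actual analysis:

```python
def px_to_inches(series_px, ref_len_px, ref_len_inches):
    """Convert pixel measurements to inches via a known reference length.

    The reference might be the runner's height: if a 66 inch tall runner
    spans 330 pixels in the frame, one pixel is 0.2 inches.
    """
    scale = ref_len_inches / ref_len_px
    return [p * scale for p in series_px]

print(px_to_inches([10, 25, 40], 330, 66))  # [2.0, 5.0, 8.0]
```

In practice the reference length must be measured at the runner’s depth in the scene, since apparent size changes with distance from the camera.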

With these data in hand, we are set to perform a more detailed analysis. Our goal is to confirm qualitative observations with quantitative metrics. We follow each estimated metric with a TAKEAWAY section, which identifies practical advice for future training.

Let’s begin with cadence, which is theorized to be positively correlated with running efficiency [6]. Cadence is defined as the average number of steps over a given time interval, typically given in units of steps per minute. We estimate cadence as follows:

  • Isolate left and right ankle series.

  • Remove outliers and perform local interpolation to refill gaps.

  • Smooth and detrend each series and use the result to estimate local minima, whose locations identify impacts in time, i.e., step times.

  • Estimate the cadence as the median number of steps per minute.
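The steps above can be sketched in Python. This is a deliberately simplified stand-in for the actual analysis: a moving average replaces proper smoothing and detrending, impacts are taken as local maxima of the detrended series (image coordinates grow downward, so a foot strike is a maximum), and cadence is computed as an overall rate rather than a per-minute median:

```python
import math

def count_impacts(ankle_y, fps):
    """Count foot impacts in one ankle's vertical position series.
    ankle_y uses image coordinates (larger = lower in the frame), so
    impacts appear as local maxima of the detrended series."""
    w = int(fps)  # ~1 second moving-average window (an assumption)
    detrended = []
    for i in range(len(ankle_y)):
        lo, hi = max(0, i - w), min(len(ankle_y), i + w + 1)
        detrended.append(ankle_y[i] - sum(ankle_y[lo:hi]) / (hi - lo))
    return sum(1 for i in range(1, len(detrended) - 1)
               if detrended[i] > detrended[i - 1]
               and detrended[i] >= detrended[i + 1]
               and detrended[i] > 0)

def estimate_cadence(left_y, right_y, fps):
    """Cadence in steps/min, combining impacts from both ankles."""
    steps = count_impacts(left_y, fps) + count_impacts(right_y, fps)
    minutes = len(left_y) / fps / 60.0
    return steps / minutes

# synthetic check: each foot striking 85 times/min should give ~170 steps/min
fps = 30
t = [i / fps for i in range(12 * fps)]  # 12 seconds of video
left = [math.sin(2 * math.pi * 85 / 60 * ti) for ti in t]
right = [math.sin(2 * math.pi * 85 / 60 * ti + math.pi) for ti in t]
print(round(estimate_cadence(left, right, fps)))  # close to 170
```

Real ankle traces are noisier than a sinusoid, which is why the actual recipe includes outlier removal and gap interpolation before peak finding.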

[Figure: cadence estimation from the detrended ankle series]

My cadence is estimated at 169 steps/min while my daughter’s is 173 steps/min. These results match well with typical cadence estimates as reported independently by GPS running watches.

TAKEAWAY If we are to adhere to the advice to run at a cadence of 180+ steps/min, it looks like we both need to pick up the tempo a bit.

We also can use the ankle series to quantify the median difference in left/right ankle elevations on the backward portion of the running stride, a metric we will call median stride asymmetry:

  • Extract left and right detrended ankle series from cadence assessment.

  • Find local maxima.

  • Calculate elevation differences between left-right pairs and report the median of those differences.
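A sketch of this recipe, assuming detrended ankle elevation series where larger values mean a higher lift; the toy numbers below are made up, in inches:

```python
import statistics

def backstroke_peaks(detrended_y):
    """Heights of local maxima, i.e. ankle lift peaks on the backstroke."""
    return [detrended_y[i] for i in range(1, len(detrended_y) - 1)
            if detrended_y[i] > detrended_y[i - 1]
            and detrended_y[i] >= detrended_y[i + 1]]

def median_stride_asymmetry(left_d, right_d):
    """Median |left - right| peak-height difference, in the series' units."""
    diffs = [abs(l - r) for l, r in zip(backstroke_peaks(left_d),
                                        backstroke_peaks(right_d))]
    return statistics.median(diffs)

# toy series: the right ankle lifts ~5 inches higher each stride
left = [0.0, 9.0, 0.0, 10.0, 0.0, 9.5, 0.0]
right = [0.0, 14.0, 0.0, 15.0, 0.0, 14.8, 0.0]
print(median_stride_asymmetry(left, right))  # 5.0
```

Pairing peaks by position, as `zip` does here, assumes both series contain the same number of strides; a dropped detection in one series would require aligning peaks by time first.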

[Figure: left/right ankle elevation differences used to compute stride asymmetry]

The results show that I have a whopping 5.29 inch median stride asymmetry while my daughter’s motion exhibits a more acceptable 2.34 inch value.

TAKEAWAY I need to focus on engaging my left gluteus maximus more and matching the height of my right ankle during the backstroke. Generally, I need to be more aware of my tendency to stride asymmetrically.

Finally, let us quantify and compare the vertical oscillation of the head using the following steps:

  • Isolate the nose position history.

  • Find local minima.

  • Detrend by subtracting a spline model fit using the minima from the original series.

  • Find the local maxima of the detrended series.

  • Recenter the series by subtracting the mean (optional, but lends itself to a nice visual display).

  • Calculate the absolute difference between successive minima-maxima pairs and report the median value.
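A simplified sketch of this recipe: the spline baseline is replaced by straight-line interpolation between successive minima, and since the minima sit at zero after detrending, the median of the detrended maxima equals the median minima-to-maxima difference. It assumes a series already in inches where larger values mean higher:

```python
import statistics

def vertical_oscillation(nose_y):
    """Median vertical oscillation of the head, via nose positions.

    Simplification of the article's recipe: a piecewise-linear baseline
    through the local minima stands in for the spline fit.
    """
    n = len(nose_y)
    minima = [i for i in range(1, n - 1)
              if nose_y[i] < nose_y[i - 1] and nose_y[i] <= nose_y[i + 1]]
    if len(minima) < 2:
        return 0.0
    # detrend: subtract the baseline interpolated between successive minima
    base = list(nose_y)
    for a, b in zip(minima, minima[1:]):
        for i in range(a, b + 1):
            base[i] = nose_y[a] + (i - a) / (b - a) * (nose_y[b] - nose_y[a])
    d = [y - bb for y, bb in zip(nose_y, base)]
    # after detrending, minima sit at ~0, so peak heights are the oscillation
    peaks = [d[i] for i in range(minima[0] + 1, minima[-1])
             if d[i] > d[i - 1] and d[i] >= d[i + 1]]
    return statistics.median(peaks) if peaks else 0.0

# toy trace: the head bobs 2 inches each stride (real series are noisier)
nose = [0.0, 2.0, 0.0, 2.0, 0.0, 2.0, 0.0]
print(vertical_oscillation(nose))  # 2.0
```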

[Figure: detrended nose series used to compute vertical head oscillation]

TAKEAWAY My daughter does a slightly better job of using her energy for forward propulsion, wasting less of it in vertical motion. This is something to continue to monitor over time, but it will likely diminish naturally as I lessen my stride asymmetry, promote a stronger forward lean from my toes, and increase my cadence.


We have demonstrated the use of pose estimation for quantifying three running performance metrics.

[Figure: summary of the three estimated running metrics]

These metrics, combined with the visual feedback obtained from pose estimation videos, have given me something solid to work on in future training sessions. I look forward to recording and processing another video to verify that my efforts to adopt a better running form are working. I can validate these improvements quantitatively via the proposed metrics with a goal of making me a more efficient runner.

While we have demonstrated the efficacy of using pose estimation to better one’s running, the story doesn’t end there. Pose estimation can be used in a wide variety of important applications. Here is a list describing a few important applications of body pose estimation [7]:

  • Sports

    • kinematic analysis of tackles in American football

    • golf swing dynamics

    • tennis swing form

  • Assisted living

    • help future robots correctly interpret body language

  • Intelligent driver assistance

    • automated estimation of driver inebriation levels

  • Medical applications

    • detection of postural issues such as scoliosis

    • physical therapy

    • study of cognitive brain development in young children by monitoring motor functionality

    • identifying disability onset via correlated motion

    • behavioral understanding

  • Video games

    • avatar movement

    • gesture and motion recognition

[Photo: a happy post-race moment with one of my children]

I am happy to report that, since adopting a new running form, I have been injury free for years now and feel less tired during training runs, which may be a testament to the efficacy of the technique. Even better, my children have taken an interest in running, achieving some pretty lofty goals at their young ages while spending precious bonding time with their dad. Pictured is one of those happy moments, where one of us won first place in their division. Can you predict which one of us was the victor?

Have sports performance data? Want to automatically recommend performance improvements to athletes? Contact us, we’ve taught machines to understand all kinds of video imagery and other performance improvement data — we might be able to help.


  1. Fields, K.B., Sykes, J.C., Walker K.M., and Jackson, J.C. (2010). Prevention of running injuries. Current Sports Medicine Reports, May-Jun;9(3):176-82. doi: 10.1249/JSR.0b013e3181de7ec5.

  2. Marathon Statistics Worldwide,

  3. Danny Dreyer and Katherine Dreyer, ChiRunning: A Revolutionary Approach to Effortless, Injury-Free Running (May 2009). Atria Books.


  5. Zhe Cao, Tomas Simon, Shih-En Wei, Yaser Sheikh (2017) Realtime Multi-Person 2D Pose Estimation using Part Affinity Fields. Computer Vision and Pattern Recognition. Accepted as CVPR 2017 Oral. arXiv:1611.08050.



Understanding Conversations in Depth through Synergistic Human/Machine Interaction


Every day, billions of people communicate via email, chat, text, social media, and more. Every day, people are communicating their desires, concerns, challenges and victories. And every day, organizations struggle to understand this conversation so they can better service their customers.

Consider a few examples:

  • A communication system enables a famous politician or star to communicate with thousands or millions of constituents or fans

  • A product or service review system like Yelp gathers free form reviews from millions of people

  • An email system automatically conducts conversations with people after they fill out a form, stop by a booth, or otherwise indicate interest

  • An insurance company records millions of audio transcripts of conversations regarding a claim

  • A trend prediction system scans social media conversations to predict the next flavor food companies should plan for — in the past it was pomegranate, what will it be in 6 months?

In each of these cases, there is a need to automatically understand what is being said. Understanding a direct message rapidly can allow a system to elevate its priority, compose a suggested reply, or even automatically reply on someone’s behalf. Understanding a large number of messages can allow a politician to make sense of their massive inbox so they can better understand their constituency’s perspective on a given day or topic.

Understanding a large number of reviews can enable a surgeon to easily understand exactly what they are doing right, and where they should improve, or help a product manager understand what aspects of their products are well received and which are problematic.

Understanding the conversation begins with understanding one document. Once we can teach a machine to understand everything in a single document, we can project this understanding up to a collection, thread or larger corpus of documents to understand the broader conversation.
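The projection from documents up to a corpus can be as simple as aggregating per-document labels. A toy roll-up; document ids and labels are made up for illustration:

```python
from collections import Counter

# per-document labels produced by single-document understanding
doc_labels = {
    "msg_001": ["question", "healthcare"],
    "msg_002": ["feedback", "positive"],
    "msg_003": ["question", "healthcare"],
}

# corpus-level view: how often does each label appear across documents?
corpus_view = Counter(label
                      for labels in doc_labels.values()
                      for label in labels)

print(corpus_view["question"], corpus_view["feedback"])  # 2 1
```

The same aggregation can be sliced by thread, author, or time window to surface how the broader conversation shifts.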

The anatomy of a single document is shown below. In it, we see a template for a document. A given document could be an email, a text or social media message, a blog post, a product review, etc. Typically, a title or subject of some sort is present. Next, some document level descriptive information is often present, like the author, the date of the document, or perhaps a case # if it is a legal document. Then we have the body of the document, usually paragraphs composed of multiple sentences. In addition to the document content shown below, documents usually exist in a context: an email can be in reply to another email, or a social media message can belong to a discussion thread. For simplicity, however, we’ll focus on a single document and leave the inter-document discussion for later.

[Figure: the anatomy of a single document]

Typically, much of this information is accessible in a machine readable form, but the unstructured text is not easily understood without some NLP (natural language processing) tailored AI, accelerated by tooling like that in our Mayetrix SDK. From an AI vantage point, there are multiple types of information we can train a machine to extract. Sentences are usually split by a model trained to do just that: we might use a sentence splitter trained on reasonably well composed text, like news, or we might train a custom sentence splitter for more informal discourse styles like those present in social media or SMS. Next, individual key phrases or entities, like specific people, places, or things, might be present inside a sentence. We have multiple options for how to automatically extract phrases and entities, typically a combination of knowledge-based and example-trained machine learning models. We also often extract sentence level insights. These might come in the form of categories a given sentence can be placed into, or of grammatical clause level information (think back to seventh grade grammar class), such as a source > action > target structure like LeBron James [nba_player] > score > final shot. Finally, there are document level insights we might extract, often assisted by the more granular information extraction described above. Document level information might include, for example, the overall sentiment or a summarization of the document.
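These extraction layers can be caricatured in a few lines. This is deliberately naive: a regex and a hard-coded knowledge base stand in for the trained models and tooling described above, and the example sentence is invented:

```python
import re

doc = "LeBron James scored the final shot. The crowd went wild! Was it luck?"
known_entities = {"LeBron James": "nba_player"}  # stand-in knowledge base

# 1. sentence splitting (a trained splitter in practice, not a regex)
sentences = re.split(r"(?<=[.!?])\s+", doc.strip())

# 2. phrase / entity extraction via the knowledge base
entities = [(name, tag) for name, tag in known_entities.items() if name in doc]

# 3. a sentence-level insight: naive question detection
questions = [s for s in sentences if s.endswith("?")]

print(sentences)  # three sentences
print(entities)   # [('LeBron James', 'nba_player')]
print(questions)  # ['Was it luck?']
```

Each layer feeds the next: entities found inside a sentence, for example, help build the clause-level source > action > target structures.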

So how do we build AI or machine learning models for each of these types of information to extract?

Much like a toddler learns the word furniture through shown examples like a chair, sofa or table, AI text analysis systems require examples.

Before we can gather examples, however, we need to decide what exactly we are going to label. This might be easy in some cases, like building a spam detector: all email messages are either spam or not spam. But in some cases, we have a significantly more complex task.


Consider, for example, the case where millions of constituents email their thoughts and opinions to their congressional representatives. We can further presume the busy congressperson receiving thousands of emails a day wishes to understand the key perspectives worthy of a response. We might find through an early analysis that constituents are often expressing an emotion of some sort, offering an opinion on a topic or piece of legislation, requesting some specific type of action, or asking questions.

An initial task is to simply understand what the broader conversation consists of. In the chart below, we see that much of this conversation might consist of feedback, a question, or an emotional expression.

[Chart: breakdown of constituent messages by high level category]

These broader high level categories might prove insufficient. We might ask what types of questions are being asked, and for some very important questions, we might want to know exactly which question is being asked, for example, “where can I buy it?” One approach we regularly use at Xyonix is to employ hierarchical label structures, or a label taxonomy. For example, for the referenced political corpus above, we might have a few entries like this:

  • suggestion/legislation_related_suggestion/healthcare_suggestion/include_public_option

  • question/legislation_related_question/healthcare_question/can_i_keep_my_doctor

  • feedback/performance_feedback/positive_feedback/you_are_doing_great
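Because each label is a path, a granular label automatically counts toward every parent category. A small sketch of that roll-up (the `ancestors` helper is hypothetical, not part of any Xyonix tooling):

```python
def ancestors(label_path):
    """All taxonomy parents of a hierarchical label, most general first."""
    parts = label_path.split("/")
    return ["/".join(parts[:i]) for i in range(1, len(parts))]

label = ("question/legislation_related_question/"
         "healthcare_question/can_i_keep_my_doctor")
for parent in ancestors(label):
    print(parent)
# question
# question/legislation_related_question
# question/legislation_related_question/healthcare_question
```

Tallying documents against these parent paths gives category counts at any level of the taxonomy without relabeling anything.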

These hierarchical labels provide a few key advantages:

  • It can be easier to teach human annotators to label very granular categories.

  • Granular categories can easily be included under other taxonomical parents after labeling has commenced, preventing costly relabeling.

  • More granularity can result in very specific corresponding actions, like a bot replying to a question.

Generating labels is often best done in conjunction with AI model construction. If AI models perform poorly at recognizing a particular label, it is often a sign that the category is too broad or fuzzy. In that case, we may choose to tease out more granular, more easily defined sub-categories.
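One way to picture the taxonomy's flexibility: because each label encodes its full path, granular leaf labels can be rolled up to any coarser level without relabeling. A minimal sketch using the example entries above:

```python
from collections import Counter

# Leaf labels assigned by annotators, taken from the taxonomy above.
labels = [
    "suggestion/legislation_related_suggestion/healthcare_suggestion/include_public_option",
    "question/legislation_related_question/healthcare_question/can_i_keep_my_doctor",
    "feedback/performance_feedback/positive_feedback/you_are_doing_great",
    "question/legislation_related_question/healthcare_question/can_i_keep_my_doctor",
]

def roll_up(label, depth):
    """Truncate a slash-delimited label to its first `depth` levels."""
    return "/".join(label.split("/")[:depth])

# Count the conversation at the coarsest taxonomy level.
top_level = Counter(roll_up(l, 1) for l in labels)
print(top_level["question"])  # 2
print(roll_up(labels[0], 2))  # suggestion/legislation_related_suggestion
```

If a new parent category is introduced later, only the path prefixes change; the expensive leaf-level annotations survive intact.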

In addition to defining labels, we also, of course, need actual examples that our AI models can learn from. The next question is how to select those examples, since human labeling is costly. Should we choose randomly, based on product or business priorities, or via something more efficient? The reality is that not all examples are created equal. If a toddler is presented with hundreds of different types of chairs and told they are furniture, but never sees a table, then they’ll likely fail to identify a table as furniture. A similar thing happens with our models.

We need to present training examples that are illustrative of the target category but different from those the model has seen before.

This is why setting arbitrary numerical targets like “label 1 million randomly selected documents” is rarely optimal. One very powerful technique we use regularly at Xyonix, accelerated by our Mayetrix platform, is to create a tight feedback loop in which mistakes the current model makes are identified by our human annotators and labeled correctly. The next model then learns from its prior mistakes, and improves faster than if trained only on random examples. The models tell the humans what they “think”, and the humans tell the models when they are wrong. When our human annotators notice the machines making many of the same mistakes, they provide more examples in that area, much the way a teacher might tailor problem sets for a student. The overall result is a nice human/machine synergy. You can read about our data annotation platform or our annotation service if you wish to see how we label data at Xyonix.
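This feedback loop is essentially active learning. A minimal sketch of the selection step, with hypothetical model confidence scores: route the examples the model is least sure about to the human annotators.

```python
# Hypothetical P(spam) outputs from the current model on unlabeled docs.
unlabeled = {
    "doc_1": 0.98,  # model is confident
    "doc_2": 0.51,  # model is on the fence
    "doc_3": 0.03,
    "doc_4": 0.45,
}

def most_uncertain(scores, k):
    """Return the k docs whose probability is closest to 0.5 --
    the examples whose human labels teach the next model the most."""
    return sorted(scores, key=lambda d: abs(scores[d] - 0.5))[:k]

to_annotate = most_uncertain(unlabeled, 2)
print(to_annotate)  # ['doc_2', 'doc_4']
```

Uncertainty sampling is only one selection strategy; annotators spotting clusters of repeated mistakes, as described above, is another signal for where to spend labeling budget.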


Once we have sufficient training data, we can begin optimizing our AI models so they are more accurate. This requires a number of steps, like:

  • efficacy assessment: comparing how well each of the tasks above performs on a set-aside test set (a set of examples the trained model has never seen)

  • model selection: selecting a model architecture, like a classical machine learning SVM or a more powerful but harder-to-train deep learning model

  • model optimization: optimizing model types, parameters and hyper-parameters; in essence, teaching the AI to build the best AI system

  • transfer learning: bootstrapping the AI from other, larger training example sets beyond those gathered for just your problem; for example, learning word and phrase meanings from Wikipedia or large collections of tweets
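The efficacy assessment step above can be sketched simply: hold examples out of training and measure accuracy only on those. The toy model and data here are hypothetical:

```python
# Labeled (message, label) pairs; the last two are set aside for testing.
data = [("win money", "spam"), ("free pills", "spam"),
        ("team lunch", "ham"), ("status update", "ham"),
        ("win prizes", "spam"), ("meeting notes", "ham")]

train, test = data[:4], data[4:]

def toy_model(text):
    # Stand-in for a model fit on `train`: flags known spammy words.
    return "spam" if ("win" in text or "pills" in text) else "ham"

# Score only on the held-out examples the "trained" model never saw.
correct = sum(1 for text, label in test if toy_model(text) == label)
accuracy = correct / len(test)
print(f"test accuracy: {accuracy:.2f}")  # test accuracy: 1.00
```

Evaluating on training examples instead would overstate performance; the set-aside test set is what makes the efficacy number trustworthy.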

Finally, once models are built and deployed, there is the next step of aggregating insights from individual documents into a broader understanding based on the overall conversation. At Xyonix, we typically employ a number of techniques, like aggregating and tracking mentions across time, or different users, or various slices of the corpus. For example, in one project of ours, we built a system that measures the overall sentiment of peer surgeons toward a surgeon who has submitted a recent surgery for review. Telling the surgeon that 44% of their reviews expressed negative sentiment is one thing, but telling them that their score is 15% below the mean of peer surgeons is another, more valuable insight. Surgeons didn’t get where they are by being average, let alone below average, so they are more likely to move to correct the specific issues mentioned.
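The surgeon example boils down to comparing an individual's aggregate score against a peer baseline. A sketch with made-up numbers mirroring those in the text:

```python
# Hypothetical per-surgeon negative-sentiment rates aggregated from
# many individually classified reviews.
peer_negative_rates = {"surgeon_a": 0.29, "surgeon_b": 0.25,
                       "surgeon_c": 0.33, "surgeon_d": 0.29}

def compare_to_peers(rate, peers):
    """Return how far a rate sits above (+) or below (-) the peer mean."""
    mean = sum(peers.values()) / len(peers)
    return rate - mean

delta = compare_to_peers(0.44, peer_negative_rates)
print(f"{delta:+.0%} vs peer mean")  # +15% vs peer mean
```

The per-document classifications are the raw material; the comparative framing is what makes the insight actionable.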

Understanding conversations in depth automatically is a significant endeavor. One key to success we’ve found is looking well beyond just AI model development: considering what the labels are, how they are structured, how they will be used, how you will improve them, how you will gather training examples for the models, how the models’ weaknesses can be addressed, and, perhaps most importantly, how you will do all of these things over a timeline, with the AI models and the products using them always improving.

Have a corpus of people communicating with you, each other, or someone else? Having trouble automatically understanding the conversation? Contact us -- we’ve taught machines to effectively read all kinds of content for all kinds of customers, and we might be able to help.

Drones to Robot Farm Hands, AI Transforms Agriculture


Swarms of drones buzz overhead, while robotic vehicles plod across the landscape. Orbiting satellites capture high-resolution multi-spectral images of the vast scene below. Not a single human can be seen in the sprawling acres. Today’s agriculture is rapidly revamping into a high-tech enterprise that most 20th-century farmers could hardly recognize. It was only 100 years ago that farming transitioned from animal power to combustion engines. In the last 20 years, the global positioning system (GPS), electronic sensors, and other new tools have moved farming even further into a technological wonderland. And now, robots empowered with artificial intelligence can zap weeds with extraordinary precision, while other autonomous machines move with industrious efficiency across farms.

It is no secret that the global population is expected to rise to 9.7 billion by 2050. To meet expected food demand, global agricultural output needs to increase by 70%; AI is helping make that goal possible (1). It is clear a change is coming: U.S. farms have seen an 86% decrease in their labor force, while the number of farms continues to rise (2). While today’s agricultural technologies and AI capabilities are evolving at a rapid rate, this evolution is just beginning. Factors such as climate change, an increasing population, and food security concerns have propelled the industry to seek more innovative approaches to assure improving crop yields.

From detecting pests to predicting which crops will deliver the best returns, artificial intelligence can help humanity confront one of its biggest challenges: feeding an additional 2 billion people by 2050 without harming the planet.

AI is steadily emerging as an essential part of the agricultural industry’s technological evolution, including self-driving machinery and flying robots able to automatically survey and treat crops. AI is helping these machines interact with one another so they can begin to frame the future of fully automated agriculture. The purpose of all this high-tech gadgetry is optimization, from both economic and environmental standpoints. The goal is to apply only the optimal amount of any input (water, fertilizer, pesticide, fuel, labor) when and where it’s needed to efficiently produce high crop yields (3).


With AI bringing all components of agriculture together, we can discuss how autonomous machines and drones are driving the future of agriculture: a future where precision robots and drones work in concert to manage entire farms.

Autonomous machines can replace people performing laborious and endless tasks, such as hand-harvesting vegetables. These robots use sensor technologies, including machine vision that can detect things like the location and size of stalks/leaves to inform their mechanical processes.

In addition, the development of flying robots (drones) opens the possibility that most field-crop scouting currently done by humans could be replaced. Many scouting tasks, such as scouting for crop pests, require someone to walk long distances in a field and turn over plant leaves to check for the presence or absence of insects. Researchers are developing technologies to enable such flying robots to scout without human involvement. An example of this is PEAT, a Berlin-based agricultural tech startup; PEAT has developed a deep learning application called Plantix that identifies potential defects and nutrient deficiencies in plants and soil. Analysis is conducted using machine learning algorithms that correlate particular foliage patterns with certain soil defects, plant pests, and diseases (4). The image recognition app identifies possible defects through images captured by the user’s smartphone. Users are then provided with soil restoration techniques, tips, and other potential solutions, reportedly with 95% accuracy.

Another company focused on bringing AI to agriculture is Trace Genomics, which applies machine learning to diagnosing soil defects. The California-based company provides soil analysis services to farmers. The system uses machine learning to give clients a sense of their soil’s strengths and weaknesses, attempting to prevent defective crops and maximize healthy crop production. According to the company’s website,

after submitting a sample of their soil to Trace Genomics, users receive a summary of their soil’s contents. Services provided in their packages range from a pathogen screening focused on bacteria and fungi to a comprehensive microbial evaluation (5).

These autonomous robots combined with drones will help define the future of AI in agriculture, while AI and machine learning models help ensure the future of crops from the root up.


It will take more than an army of robotic tractors to grow and harvest a successful crop. In the next 10 years, the agricultural drone industry will generate 100,000 jobs in the U.S. and $82 billion in economic activity, according to Bank of America Merrill Lynch Global Research (6).

From spotting leaks to patrolling for pathogens, drones are taking up chores on the farm. While the presence of drones in agriculture dates back to the 1980s for crop dusting in Japan, the farms of the future will rely on machine learning models that guide the drones, satellites, and other airborne devices providing data about their crops on the ground.

As farmers try to adapt to climate change and other factors, drones promise to help make the entire farming enterprise more efficient. For instance, Descartes Labs is employing machine learning to analyze satellite imagery to forecast soy and corn yields. The New Mexico startup collects 5 terabytes of data every day from multiple satellite constellations, including those of NASA and the European Space Agency (7). Combined with weather readings and other real-time inputs, Descartes Labs reports it can predict cornfield yields with high accuracy. Its AI platform can even assess crop health from infrared readings.
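As one concrete example of what can be derived from multi-spectral imagery, NDVI (normalized difference vegetation index) is a standard crop-health signal computed from red and near-infrared reflectance: healthy vegetation reflects near-infrared strongly while absorbing red light. This is an illustrative computation, not any particular vendor's pipeline:

```python
def ndvi(nir, red):
    """NDVI = (NIR - Red) / (NIR + Red), ranging over [-1, 1];
    higher values generally indicate healthier vegetation."""
    return (nir - red) / (nir + red) if (nir + red) else 0.0

# Hypothetical per-pixel reflectance values for two field locations.
healthy = ndvi(nir=0.50, red=0.08)
stressed = ndvi(nir=0.30, red=0.20)
print(f"healthy: {healthy:.2f}, stressed: {stressed:.2f}")
# healthy: 0.72, stressed: 0.20
```

Indices like this, computed per pixel across a whole field, are the kind of input an ML yield-forecasting model consumes alongside weather and other real-time data.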

With the market for drones in agriculture projected to reach $480 million by 2027 (8), companies are also looking to bring drone technology to specific vertical areas of agriculture. VineView, for example, is looking to bring drones to vineyards. The company aims to help farmers improve crop yield and reduce costs (9). A farmer pre-programs a drone’s route, and once deployed, the drone leverages computer vision to record images which are used for later analysis.

VineView analyzes captured imagery to provide a detailed report on the health of the vineyard, specifically the condition of grapevine leaves. Since grapevine leaves are often telltales for grapevine diseases (such as molds and bacteria), reading the “health” of the leaves is often a good indicator for understanding the health of the plants and their fruit as a whole.

The company declares that its technology can scan 50 acres in 24 minutes and provide data analysis with high accuracy (10). This aerial imaging, combined with AI techniques and machine learning platforms, is the start of something being referred to as “precision agriculture”.

Precision agriculture (PA) is an approach to farm management that uses information technology to ensure that crops and soil receive exactly what they need for optimum health and productivity. The goal of PA is to ensure profitability, sustainability, and environmental protection. Since insecticide, for example, goes only exactly where it is needed, environmental runoff is markedly reduced.

Precision agriculture requires three things to be successful: physical tools such as tractors and drones, site-specific information acquired by those machines, and the ability to understand and make decisions based on that site-specific information.

Decision-making is often aided by AI-based computer models that mathematically and statistically analyze relationships between variables like soil fertility and crop yield. Self-driving machinery and flying robots able to automatically survey and treat crops will become commonplace on farms that practice precision agriculture. Other examples of PA involve varying the rate of planting seeds in the field according to soil type, and using AI analysis and sensors to identify the presence of weeds, diseases, or insects so that pesticides can be applied only where needed. The Food and Agriculture Organization of the United Nations estimates that 20 to 40 percent of global crop yields are lost each year to pests and diseases, despite the application of millions of tons of pesticides, so finding more productive and sustainable farming methods will benefit billions of people (11).
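The "apply only where needed" idea can be sketched as thresholding a per-cell weed-probability map from a hypothetical vision model and spraying only the cells that cross it:

```python
# Hypothetical per-cell weed probabilities from a vision model,
# laid out as a small field grid.
weed_probability = [
    [0.05, 0.92, 0.10],
    [0.88, 0.07, 0.03],
    [0.02, 0.04, 0.95],
]

THRESHOLD = 0.5  # spray only where the model is fairly confident

spray_plan = [[p > THRESHOLD for p in row] for row in weed_probability]
cells_sprayed = sum(cell for row in spray_plan for cell in row)
total = sum(len(row) for row in weed_probability)
print(f"spraying {cells_sprayed}/{total} cells")  # spraying 3/9 cells
```

Versus blanket spraying all nine cells, the targeted plan cuts chemical use to a third in this toy grid; real see-and-spray systems make the same trade at per-plant resolution.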


Deere & Company recently announced it would acquire a startup called Blue River Technology for a reported $305 million. Blue River has developed a “see-and-spray” system that leverages computer vision, a technology we here at Xyonix deploy regularly, to discriminate between crops and weeds. It hits the former with fertilizer and blasts the latter with herbicides with such precision that it is able to eliminate 90 percent of the chemicals used in conventional agriculture.

It’s not just farmland that’s getting a helping hand from robots and artificial intelligence. A California company called Abundant Robotics, spun out of the nonprofit research institute SRI International, is developing robots capable of picking apples with vacuum-like arms that suck the fruit straight off the trees in the orchards (12). Iron Ox, out of San Francisco, is developing one-acre urban greenhouses that will be operated by robots and reportedly capable of producing the equivalent of 30 acres of farmland. Powered by artificial intelligence, a team of three robots will run the entire operation of planting, nurturing, and harvesting the crops (13).

Vertical farming startup Plenty, also based in San Francisco, uses AI to automate its operations, and got a $200 million vote of confidence from the SoftBank Vision Fund earlier this year. The company claims its system uses only 1 percent of the water consumed in conventional agriculture while producing 350 times as much produce (14). Plenty is part of a new crop of urban-oriented farms, including Bowery Farming and AeroFarms.

Agricultural production has come so far in even the past couple decades that it’s hard to imagine what it will look like in a few more. But the pace of high-tech innovations in agriculture is only expanding.

Don’t be surprised if, 10 years from now, you drive down a rural highway and see small helicopters flying over a field, descending into the crop, using robotic grippers to manipulate leaves and cameras with machine vision to look for insects, then rising back above the crop canopy and heading toward their next location. All without a human being in sight.

So what is in store for the future? Farmers can expect that, in the near future, their drones and robots will have the AI capabilities to handle everything from crop assessment and cattle counting to crop disease monitoring, water management, and mechanical pollination.

Have agriculture data? Multi-spectral aerial imagery? Operational farm data? Need help mining your data with AI to glean insights? CONTACT us -- we might be able to help.

Helping At Home Healthcare Patients with Artificial Intelligence


Until very recently, a caregiving parent possessed few at-home health tools beyond a simple thermometer. Then, as the internet developed, so too did online healthcare sites such as WebMD, offering another very powerful tool: information. At-home health tools continue to undergo rapid, massive changes, and now it’s AI leading the way. Today a parent can look inside their child’s ear and receive help treating an ear infection, or an elderly person can conduct their own hearing test without ever leaving the house, often with intelligent machines operating behind the scenes. Increasingly smart at-home health devices are evolving through the rapid proliferation of AI and the growing embrace of digital medicine. These new tools include devices like smart stethoscopes that automatically detect heartbeat abnormalities, or AI-powered otoscopes that can look in a person’s ear and detect an infection.

Imagine a world where at-home AI healthcare tools get smarter and more able to heal you every day. These tools are intensely data driven: they continuously collect data about your body, your environment, your nutrition, and your activity, and their algorithms continuously learn from this data, not just from you, but from millions of other patients and the doctors who know how to make sense of this information.

These AI tools will then deliver personalized healthcare tips and remediation throughout your whole life, perhaps one day without you ever having to set foot in a brick-and-mortar hospital.

AI can help wherever the care provider is identifying patterns: for example, whenever a physician identifies the acoustic pattern of a heart murmur, the visual pattern of an ear infection image, or the contours and shapes of a carcinogenic skin lesion.

What if AI could help you or a doctor predict a deteriorating heart condition? “If you can go to the hospital and say, ‘I’m about to have a heart attack,’ and you have proof from an FDA-approved product, it is less costly to treat you,” said author and ABI Principal Analyst Pierce Owen (1). Other at-home healthcare tools are becoming smarter every day: EEG headbands that can monitor your workouts and vitals; smart beds and devices such as EarlySense that detect movement in your sleep and deliver detailed, data-driven reports on a variety of vitals, including how much sleep (and deep sleep) you are actually getting; and smart baby monitors that allow parents to track newborn vitals (2)(3)(4).

One significant way at-home AI healthcare is taking off is by helping parents with young children.

Parents can never get answers quickly enough when something is wrong with their child. So what if they never even had to drive to the doctor’s office?

According to the National Institute on Deafness and Other Communication Disorders (NIDCD), 5 out of 6 children experience ear infections by the time they are 3 years old. That’s nearly 30 million trips to the doctor’s office a year just for ear infections in the U.S. alone. Additionally, ear infections cost the U.S. health system $3 billion per year.


This is where companies like Cellscope step in. A pioneer in the otoscope industry, Cellscope has had success launching its otoscope, Oto Home. Oto Home is a small smartphone peripheral that slides onto the user’s iPhone, accompanied by an app. Once inside the child’s or patient’s ear, the app’s software recognition feature, called the Eardrum Finder, directs the user to move and tilt the scope to capture the visuals a physician will need to attempt a diagnosis. After the session, the user enters some basic information about the patient, and both the recording and the information are sent to a remote physician who reviews the data and, if necessary, can prescribe medication (5). This same image can also be used by an artificial intelligence system to assist the physician with a diagnosis. The use of the AI system can decrease the cost of more expensive tests, in addition to identifying more refined possible diagnoses.

AI in healthcare can now also detect heartbeat abnormalities that the human ear cannot always detect. Steth IO’s motto captures exactly the premise of the company’s goal: “see what you cannot hear”. One study found that doctors across three countries could detect abnormal heart sounds only about 20% of the time (6).

Using thousands of varied heartbeat recordings, our Xyonix data scientists trained the Steth IO AI tool to “learn” which sounds are out of the norm. After the system takes in the encrypted and anonymized heartbeat recordings, it sends back a classification like “normal” or “murmur” to assist the physician in their diagnosis.
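As an illustration only (not Steth IO's actual method), abnormal-sound detection can be sketched as extracting simple waveform features and comparing them to class centroids learned from labeled recordings; the features, centroids, and sample data below are all hypothetical:

```python
def features(samples):
    """Toy features: mean absolute amplitude and zero-crossing rate."""
    n = len(samples)
    energy = sum(abs(s) for s in samples) / n
    zcr = sum(1 for a, b in zip(samples, samples[1:]) if a * b < 0) / (n - 1)
    return (energy, zcr)

# Hypothetical centroids computed from labeled training recordings.
centroids = {"normal": (0.20, 0.10), "murmur": (0.45, 0.40)}

def classify(samples):
    """Assign the label whose feature centroid is nearest."""
    f = features(samples)
    dist = lambda c: sum((a - b) ** 2 for a, b in zip(f, c))
    return min(centroids, key=lambda label: dist(centroids[label]))

recording = [0.5, -0.4, 0.5, -0.5, 0.4, -0.5, 0.5, -0.4]  # noisy, high ZCR
print(classify(recording))  # murmur
```

Real systems use far richer spectral features and deep models, but the shape of the problem is the same: labeled example sounds in, a "normal"/"murmur" call out.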

Because patients can see and hear their own heart and lung sounds, improved patient engagement is an added bonus for physicians. Steth IO also differentiates itself from other emerging AI healthcare tools by integrating the bell of the stethoscope directly into the iPhone, so there is no need for Bluetooth or pairing, and it displays all results in real time (8).

While this tool is currently operated only by physicians, as the at-home healthcare space rapidly grows, we expect to see similar heartbeat abnormality detection capabilities tailored for at-home use, so that you can check on your own health and that of your loved ones.


Virtual AI-driven healthcare systems are also quickly making their way into people’s homes. Take, for example, HealthTap, which brings quality medical service to people around the world who lack the ability to pay. How it works: patients receive a free consultation via video, voice, or text. Then “Dr. A.I.”, their new artificial intelligence powered “physician”, converses with the patient to identify the key issues and worries the patient is having. Dr. A.I. then uses general information about the patient and applies deep learning algorithms to assess their symptoms and apply clinical expertise, attempting to direct the user to an appropriate type and scale of care (9).

Dr. A.I. isn’t the only new AI that can give you healthcare advice from the comfort of your home. CareAngel has launched its AI virtual nurse assistant, Angel. The goal is to reduce hospital readmissions by continuously giving medical advice and reminders between discharge and doctor visits. Healthcare providers can also use Angel to check in on patients, support medication adherence, and monitor their patients’ vitals (10). Ultimately, this AI technology strives to significantly reduce the administrative and operational costs of nurse and call-center outreach.

In a world where healthcare is meeting resistance from rising costs, the emergence of innovations in AI and digital health is expected to redefine how people seek care and how physicians operate. The goals and visions of most emerging health companies are currently simple: allow new suppliers and providers into the healthcare ecosystem, empower the patient and provider with real-time data and connection, and take on lowering general and long-term healthcare costs.

While healthcare has always been patient centered, AI is taking patients from a world of episodic in-clinic interactions to more regular, on-demand, in-home care provider/patient interaction.

Trying to make your medical device smarter? Need help with your AI needs? CONTACT us -- we might be able to help.