A significant challenge in teaching machines to automatically analyze, understand, and glean object-related insights from video is how to efficiently and accurately prepare the large numbers of examples used to train and evaluate models. With frame rates around 30 to 60 fps, accurately labeling objects in even a short span of video can be extremely time-consuming and expensive.
Today, we have the pleasure of introducing you to Vannot—an open source, web-based, and easy-to-integrate video annotation tool we created to help efficiently annotate objects for training machine learning video segmentation models. Vannot takes advantage of the similarity between nearby frames to enable efficient object annotation in a web context with geographically distributed labelers.
We took inspiration from some of the industry's most venerable drawing and illustration applications, and reframed them in close consideration of the workflows involved in annotating large amounts of video data. It is easy, for example, to advance a few frames or seconds and carry over the most recent shapes and annotations, so that all you have to do each time is make a few small adjustments. More advanced features are available as well: it's possible to group adjacent or disjoint shapes into the same instance if, for example, an object is composed of many parts or is obscured behind some interloper.
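To illustrate the carry-over idea, here is a minimal sketch in TypeScript. This is not Vannot's actual data model—the class and method names are hypothetical—but it shows the core trick: annotations are keyed by frame, and advancing copies the nearest earlier frame's shapes so the labeler only nudges points instead of redrawing from scratch.

```typescript
// Hypothetical sketch, not Vannot's real implementation: shapes keyed by
// frame number, with carry-over copying from the nearest annotated frame.
type Point = { x: number; y: number };
type Shape = { label: string; points: Point[]; instanceId?: number };

class AnnotationTrack {
  private frames = new Map<number, Shape[]>();

  setShapes(frame: number, shapes: Shape[]): void {
    this.frames.set(frame, shapes);
  }

  getShapes(frame: number): Shape[] | undefined {
    return this.frames.get(frame);
  }

  // Copy the nearest earlier frame's shapes forward to `frame`, returning
  // a deep copy the labeler can adjust without disturbing the original.
  carryOver(frame: number): Shape[] {
    const prior = [...this.frames.keys()].filter((f) => f < frame);
    if (prior.length === 0) return [];
    const nearest = Math.max(...prior);
    const copied = this.frames.get(nearest)!.map((s) => ({
      ...s,
      points: s.points.map((p) => ({ ...p })),
    }));
    this.frames.set(frame, copied);
    return copied;
  }
}
```

Because each carried-over frame gets an independent deep copy, small per-frame adjustments never propagate backward, while the labeler still avoids redrawing near-identical shapes on every frame.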
Have a look at the video below to see Vannot in action. In this video, Vannot designer and sailor Clint Tseng walks through the preparation of sailing-related training data, such as hull, jib, and mainsail object segmentation.